Cerebellum and Embryo Specific Protein



Background of the Invention 

Field of the Invention 



The present invention relates to a novel endothelial factor. More
specifically, isolated nucleic acid molecules are provided encoding a human
cerebellum and embryo specific protein. Cerebellum and embryo specific
polypeptides are also provided, as are vectors, host cells and recombinant methods
for producing the same. Also provided are diagnostic and therapeutic methods
relating to cerebellum and embryo specific protein-related disorders.



Related Art



Myocardial necrosis results from occlusion of a coronary artery by a
thrombus, which forms on a destabilized atherosclerotic plaque, often following
plaque rupture. Plaques most prone to rupture and thrombosis may initially be
only mildly stenotic (i.e., 50% to 60% stenotic). However, myocardial damage
proceeds rapidly as a "wave front" of injury, moving from endocardium to
epicardium, and may become complete and irreversible within three to four hours
unless the infarct zone is adequately nourished by collateral blood supply or unless
recanalization of the artery (i.e., revascularization) is accomplished. See Rogers,
W.J., Am. J. Med. 99:195-206 (1995). However, collateral circulation typically
does not develop until a severe coronary artery stenosis has already developed
(Schaper, W., European Heart J. 16:66-68 (1995)).

In one model of coronary angiogenesis, vascular formation occurs through
three major stages including 1) vessel dilation and endothelial cell activation; 2)
formation of a new vascular channel; and 3) maturation of the new vessel and final
differentiation of all vascular cells (Rakusan, K., Coronary Angiogenesis: From
Morphometry to Molecular Biology and Back, in Claycomb, W.C. and DiNardo,
P., Ann. New York Acad. Sci. 752:257-266 (1995)). Agents which promote
angiogenesis, and particularly coronary artery angiogenesis, are therapeutically
valuable to patients afflicted with vascular disease, and particularly heart disease.
Such agents promote the formation of collateral circulation and ameliorate the
pathological effects of coronary artery occlusion.

Percutaneous transluminal coronary angioplasty (PTCA) is a commonly used
revascularization treatment for coronary artery occlusion and myocardial necrosis.
However, coronary artery luminal narrowing (restenosis) after PTCA is an
unfortunate complication which occurs in many patients (Rensing, B. J. et al.,
Circulation 88:975-985 (1993)). There remains a need for therapeutic agents
which can be used to prevent and treat restenosis.



Summary of the Invention 

The present invention provides isolated nucleic acid molecules comprising 
a polynucleotide encoding the cerebellum and embryo specific protein (hereinafter 
"CESP") having the amino acid sequence shown in SEQ ID NO:2 or the amino 
acid sequence encoded by the cDNA clone deposited in a bacterial host as ATCC 
Deposit Number 97728 on September 23, 1996. 

The present invention also relates to recombinant vectors, which include 
the isolated nucleic acid molecules of the present invention, and to host cells 
containing the recombinant vectors, as well as to methods of making such vectors 
and host cells and for using them for production of CESP polypeptides or peptides 
by recombinant techniques. 

The invention further provides an isolated CESP polypeptide having an 
amino acid sequence encoded by a polynucleotide described herein. 

For a number of CESP-related disorders, it is believed that significantly 
higher or lower levels of CESP gene expression can be detected in certain tissues 
(e.g., heart, renal tubule, renal glomerulus, vascular endothelium, and aortic 
endothelium) or bodily fluids (e.g., blood, serum, plasma, urine, synovial fluid or 
spinal fluid, and amniotic fluid) taken from an individual having such a disorder, 
relative to a "standard" CESP gene expression level, i.e., the CESP expression 
level in tissue or bodily fluids from an individual not having the CESP-related
disorder. Thus, the invention provides a diagnostic method useful during diagnosis 
of a CESP-related disorder, which involves: (a) assaying CESP gene expression 
level in cells or body fluid of an individual; (b) comparing the CESP gene 
expression level with a standard CESP gene expression level, whereby an increase 
or decrease in the assayed CESP gene expression level compared to the standard 
expression level is indicative of a CESP-related disorder. 

An additional aspect of the invention is related to a method for treating an 
individual in need of an increased level of CESP activity in the body comprising 
administering to such an individual a composition comprising a therapeutically 
effective amount of an isolated CESP polypeptide of the invention or an agonist
thereof.



Brief Description of the Figures 

Figures 1A-1C show the nucleotide (SEQ ID NO:1) and deduced amino
acid (SEQ ID NO:2) sequences of CESP. The protein has a leader sequence of 
about 21 amino acid residues (underlined) and a deduced molecular weight of 
about 38 kDa. The predicted amino acid sequence of the mature CESP protein 
is also shown. 

Figure 2 shows the regions of similarity between the amino acid sequences 
of the CESP protein and a chicken gene for which the function is unknown 
(GenBank accession number D26311; SEQ ID NO:3).

Figure 3 shows an analysis of the CESP amino acid sequence. Alpha, beta,
turn and coil regions; hydrophilicity and hydrophobicity; amphipathic regions;
flexible regions; antigenic index and surface probability are shown. In the
"Antigenic Index - Jameson-Wolf" graph, amino acid residues about 20 to about
86, about 92 to about 126, about 135 to about 157, about 169 to about 190, about
195 to about 219, about 234 to about 250, about 255 to about 274, and about 288
to about 336 in Figure 1 correspond to the shown highly antigenic regions of the
CESP protein. These highly antigenic fragments in Figure 1 correspond
to the following fragments, respectively, in SEQ ID NO:2: amino acid residues
about -1 to about 65, about 71 to about 105, about 114 to about 136, about 148
to about 169, about 174 to about 198, about 213 to about 229, about 234 to
about 253, and about 267 to about 315.
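The Figure 1 ranges and the SEQ ID NO:2 ranges given for the antigenic regions appear to differ by a constant offset of 21, the predicted leader length. The following sketch checks that observation; the flat subtraction and the names used are illustrative assumptions, not a numbering convention stated in the specification.

```python
# Assumed offset: the predicted leader length of about 21 residues.
LEADER_LENGTH = 21

# Antigenic regions as numbered in Figure 1 (start, end).
FIGURE_1_REGIONS = [(20, 86), (92, 126), (135, 157), (169, 190),
                    (195, 219), (234, 250), (255, 274), (288, 336)]

def to_seq_id_numbering(region, offset=LEADER_LENGTH):
    """Shift a (start, end) pair from Figure 1 numbering by a flat offset."""
    start, end = region
    return (start - offset, end - offset)

seq_id_regions = [to_seq_id_numbering(r) for r in FIGURE_1_REGIONS]
for fig, seq in zip(FIGURE_1_REGIONS, seq_id_regions):
    print(f"Figure 1 {fig[0]}-{fig[1]} -> SEQ ID NO:2 {seq[0]} to {seq[1]}")
```

The shifted pairs reproduce the SEQ ID NO:2 ranges recited for the epitope-bearing portions (about -1 to about 65 through about 267 to about 315).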

Detailed Description 

The present invention provides isolated nucleic acid molecules comprising
a polynucleotide encoding a CESP polypeptide having the amino acid sequence
shown in SEQ ID NO:2. The CESP protein of the present invention shares
sequence homology with a chicken gene for which the function is unknown
(Figure 2; SEQ ID NO:3) (GenBank accession number D26311).

The amino acid sequence in SEQ ID NO:2 was deduced from the sequence
of CESP cDNA clone HHFHG78. The nucleotide sequence shown in SEQ ID
NO:1 was obtained by sequencing the HHFHG78 clone, which was deposited on
September 23, 1996 at the American Type Culture Collection, 12301 Parklawn
Drive, Rockville, Maryland 20852, and given accession number 97728. In this
clone, the CESP sequence is contained between EcoRI and XhoI sites in the
polylinker of the pBluescript SK(-) plasmid.

Nucleic Acid Molecules



Unless otherwise indicated, all nucleotide sequences determined by
sequencing a DNA molecule herein were determined using an automated DNA
sequencer (such as the Model 373 from Applied Biosystems, Inc.), and all amino
acid sequences of polypeptides encoded by DNA molecules determined herein
were predicted by translation of a DNA sequence determined as above.
Therefore, as is known in the art for any DNA sequence determined by this
automated approach, any nucleotide sequence determined herein may contain
some errors. Nucleotide sequences determined by automation are typically at least
about 90% identical, more typically at least about 95% to at least about 99.9% 
identical to the actual nucleotide sequence of the sequenced DNA molecule. The 
actual sequence can be more precisely determined by other approaches including 
manual DNA sequencing methods well known in the art. As is also known in the 
art, a single insertion or deletion in a determined nucleotide sequence, compared 
to the actual sequence, will cause a frame shift in translation of the nucleotide 
sequence such that the predicted amino acid sequence encoded by a determined 
nucleotide sequence will be completely different from the amino acid sequence 
actually encoded by the sequenced DNA molecule, beginning at the point of such 
an insertion or deletion. 
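The effect of a single-base sequencing error on the predicted translation can be illustrated with a short sketch. The 18-nt sequence below is hypothetical and is not taken from SEQ ID NO:1; only the standard genetic code is assumed.

```python
# Standard genetic code, laid out in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA coding strand codon by codon; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Hypothetical reading frame standing in for the "actual" sequence.
actual = "ATGGCCAAGCTGTTCGAA"
# Simulated sequencing error: one extra G reported after the sixth base.
determined = actual[:6] + "G" + actual[6:]

print(translate(actual))      # MAKLFE
print(translate(determined))  # MAEAVR
```

The first two residues agree, and every residue from the point of the insertion onward differs, which is the frame shift described above.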

Using the information provided herein, such as the nucleotide sequence in 
SEQ ID NO:1, a nucleic acid molecule of the present invention encoding a CESP
polypeptide may be obtained using standard cloning and screening procedures,
such as those for cloning cDNAs using mRNA as starting material. Illustrative of
the invention, the nucleic acid molecule described in SEQ ID NO:1 was
discovered in a cDNA library derived from human fetal heart. The determined
nucleotide sequence of the CESP cDNA of SEQ ID NO:1 contains an open
reading frame encoding a protein of 350 amino acid residues, and a deduced
molecular weight of about 38 kDa. The CESP protein shown in SEQ ID NO:2
is about 58% identical and about 74% similar to an unknown chicken gene
(GenBank accession number D26311) (Figure 2; SEQ ID NO:3).
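As a rough plausibility check, a 350-residue protein and a mass of about 38 kDa are consistent with the common rule of thumb of roughly 110 Da per amino acid residue. The sketch below uses that average as an assumption; it is not the method by which the deduced molecular weight was obtained.

```python
# Assumed rule-of-thumb average residue mass; not a value taken from the specification.
AVG_RESIDUE_MASS_DA = 110.0

def approx_mass_kda(n_residues, avg_da=AVG_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from the residue count."""
    return n_residues * avg_da / 1000.0

def orf_length_nt(n_residues, include_stop=True):
    """Number of nucleotides in an ORF encoding n_residues, optionally with the stop codon."""
    return 3 * (n_residues + (1 if include_stop else 0))

print(approx_mass_kda(350))  # 38.5
print(orf_length_nt(350))    # 1053
```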

The present invention also provides the mature form(s) of the CESP 
protein of the present invention. According to the signal hypothesis, proteins 
secreted by mammalian cells have a signal or secretory leader sequence which is 
cleaved from the mature protein once export of the growing protein chain across 
the rough endoplasmic reticulum has been initiated. Most mammalian cells and 
even insect cells cleave secreted proteins with the same specificity. However, in 
some cases, cleavage of a secreted protein is not entirely uniform, which results 
in two or more mature species of the protein. Further, it has long been known
that the cleavage specificity of a secreted protein is ultimately determined by the
primary structure of the complete protein, that is, it is inherent in the amino acid
sequence of the polypeptide. Therefore, the present invention provides a
nucleotide sequence encoding the mature CESP polypeptides having the amino
acid sequence encoded by the cDNA clone contained in the host identified as
ATCC Deposit No. 97728 and as shown in SEQ ID NO:2. By the mature CESP
protein having the amino acid sequence encoded by the cDNA clone contained in
the host identified as ATCC Deposit 97728 is meant the mature form(s) of the
CESP protein produced by expression in a mammalian cell (e.g., COS cells as
described below) of the complete open reading frame encoded by the human DNA
sequence of the clone contained in the vector in the deposited host. As indicated
below, the mature CESP protein having the amino acid sequence encoded by the
cDNA clone contained in ATCC Deposit No. 97728 may or may not differ from
the predicted "mature" CESP protein shown in SEQ ID NO:2 (amino acids from
about 1 to about 329 in SEQ ID NO:2), depending on the accuracy of the
predicted cleavage site based on computer analysis.

Methods for predicting whether a protein has a secretory leader as well as
the cleavage point for that leader sequence are available. For instance, the
methods of McGeoch (Virus Res. 3:271-286 (1985)) and von Heijne (Nucleic
Acids Res. 14:4683-4690 (1986)) can be used. The accuracy of predicting the
cleavage points of known mammalian secretory proteins for each of these methods
is in the range of 75-80%. von Heijne, supra. However, the two methods do not
always produce the same predicted cleavage point(s) for a given protein.

In the present case, the predicted amino acid sequence of the complete
CESP polypeptide of the present invention was analyzed by a computer program
("PSORT") (K. Nakai and M. Kanehisa, Genomics 14:897-911 (1992)), which is
an expert system for predicting the cellular location of a protein based on the
amino acid sequence. As part of this computational prediction of localization, the
methods of McGeoch and von Heijne are incorporated. The analysis by the
PSORT program predicted the cleavage site between amino acids -1 and 1 in



WO 98/27932 

PCT/US97/23518 




SEQ ID NO:2. Thereafter, the complete amino acid sequences were further
analyzed by visual inspection, applying a simple form of the (-1,-3) rule of von
Heijne. von Heijne, supra. Thus, the leader sequence for the CESP protein is
predicted to consist of amino acid residues -21 to -1 in SEQ ID NO:2, while the
predicted mature CESP protein consists of residues about 1 to about 329 in SEQ
ID NO:2.
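A simple form of the (-1,-3) rule can be sketched as a scan for candidate cleavage sites at which the residues one and three positions before the cut are small and uncharged. The allowed-residue sets and the test sequence below are illustrative assumptions, not the criteria or data used by PSORT or by the inventors; the 14 to 50 residue window follows the leader-length range stated in this specification.

```python
# Assumed residue sets for a simplified (-1,-3) rule; published variants differ in detail.
ALLOWED_MINUS_1 = set("AGSCT")
ALLOWED_MINUS_3 = set("AGSCTVIL")

def candidate_cleavage_sites(protein, min_leader=14, max_leader=50):
    """Return leader lengths n for which the residues at -1 and -3 pass the simplified rule."""
    sites = []
    upper = min(max_leader, len(protein) - 1)
    for n in range(min_leader, upper + 1):
        minus_1 = protein[n - 1]  # last residue of the putative leader
        minus_3 = protein[n - 3]  # third residue before the cleavage point
        if minus_1 in ALLOWED_MINUS_1 and minus_3 in ALLOWED_MINUS_3:
            sites.append(n)
    return sites

# Hypothetical precursor: a 21-residue leader followed by a charged mature N-terminus.
precursor = "MKTLLLALLLPLLWAGSAVSA" + "QDEPRKFYWHNDEQ"
print(candidate_cleavage_sites(precursor))  # [15, 17, 18, 20, 21]
```

The rule alone typically leaves several candidates, which is one reason it is applied together with other predictors and visual inspection.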

As one of ordinary skill would appreciate, due to the possibilities of 
sequencing errors discussed above, as well as the variability of cleavage sites for 
leaders in different known proteins, the full-length CESP polypeptide comprises 
about 350 amino acids, but may be anywhere in the range of 335 to 365 amino 
acids; and the predicted leader sequence of this protein is about 21 amino acids, 
but may be anywhere in the range of about 14 to about 50 amino acids. 

As indicated, nucleic acid molecules of the present invention may be in the
form of RNA, such as mRNA, or in the form of DNA, including, for instance,
cDNA and genomic DNA obtained by cloning or produced synthetically. The
DNA may be double-stranded or single-stranded. Single-stranded DNA or RNA 
may be the coding strand, also known as the sense strand, or it may be the 
non-coding strand, also referred to as the anti-sense strand. 
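The relationship between the sense strand, the anti-sense strand and an mRNA transcript can be shown with a minimal sketch. The 9-nt sequence is hypothetical.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna):
    """Return the anti-sense strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def transcribe(sense_dna):
    """Return the mRNA corresponding to a sense (coding) DNA strand."""
    return sense_dna.replace("T", "U").replace("t", "u")

sense = "ATGGCCAAG"                 # hypothetical coding strand
print(reverse_complement(sense))    # CTTGGCCAT
print(transcribe(sense))            # AUGGCCAAG
```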

By "isolated" nucleic acid molecule(s) is intended a nucleic acid molecule,
DNA or RNA, which has been removed from its native environment. For
example, recombinant DNA molecules contained in a vector are considered
isolated for the purposes of the present invention. Further examples of isolated
DNA molecules include recombinant DNA molecules maintained in heterologous
host cells or purified (partially or substantially) DNA molecules in solution.
Isolated RNA molecules include in vivo or in vitro RNA transcripts of the DNA
molecules of the present invention. Isolated nucleic acid molecules according to
the present invention further include such molecules produced synthetically.

Isolated nucleic acid molecules of the present invention include DNA
molecules comprising an open reading frame (ORF) shown in SEQ ID NO:1;
DNA molecules comprising the coding sequence for the mature CESP protein
shown in SEQ ID NO:2 (last 329 amino acids); and DNA molecules which
comprise a sequence substantially different from those described above but which,
due to the degeneracy of the genetic code, still encode the CESP protein. Of
course, the genetic code is well known in the art. Thus, it would be routine for
one skilled in the art to generate such degenerate variants.

In another aspect, the invention provides isolated nucleic acid molecules
encoding the CESP polypeptide having an amino acid sequence encoded by the
cDNA clone contained in the plasmid deposited as ATCC Deposit No. 97728 on
September 23, 1996. In a further embodiment, this nucleic acid molecule will
encode the mature polypeptide encoded by the above-described deposited cDNA
clone. The invention further provides an isolated nucleic acid
molecule having the nucleotide sequence shown in SEQ ID NO:1 or the nucleotide
sequence of the CESP cDNA contained in the above-described deposited
clone, or a nucleic acid molecule having a sequence complementary to one of the
above sequences. Such isolated molecules, particularly DNA molecules, are
useful as probes for gene mapping, by in situ hybridization with chromosomes,
and for detecting expression of the CESP gene in human tissue, for instance, by
Northern blot analysis.

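The present invention is further directed to fragments of the isolated
nucleic acid molecules described herein. By a fragment of an isolated nucleic acid
molecule having the nucleotide sequence of the deposited cDNA or the nucleotide
sequence shown in SEQ ID NO:1 is intended fragments at least about 15 nt, and
more preferably at least about 20 nt, still more preferably at least about 30 nt, and
even more preferably, at least about 40 nt in length, which are useful as diagnostic
probes and primers as discussed herein. Of course, larger fragments, such as
fragments 950 or 1000 nt in length, are also useful according to the present
invention, as are fragments corresponding to most, if not all, of the nucleotide
sequence of the deposited cDNA or as shown in SEQ ID NO:1. By a fragment
at least 30 nt in length, for example, is intended fragments which include 30 or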





more contiguous bases from the nucleotide sequence of the deposited cDNA or 
the nucleotide sequence as shown in SEQ ID NO: 1. 
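A fragment "at least N nt in length" is defined above as N or more contiguous bases of the reference sequence. A minimal sketch of that definition follows; the function names and the toy sequence are illustrative only.

```python
def fragments(seq, min_len):
    """Yield every contiguous fragment of seq that is at least min_len bases long."""
    n = len(seq)
    for length in range(min_len, n + 1):
        for start in range(0, n - length + 1):
            yield seq[start:start + length]

def count_windows(seq_len, k):
    """Number of distinct start positions for a fragment of exactly k bases."""
    return max(seq_len - k + 1, 0)

print(list(fragments("ACGTA", 4)))  # ['ACGT', 'CGTA', 'ACGTA']
print(count_windows(1000, 20))      # 981
```

For a hypothetical 1000-nt reference, there are 981 possible 20-nt probe or primer start positions.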

Preferred nucleic acid fragments of the present invention also include
nucleic acid molecules encoding epitope-bearing portions of the CESP protein.
In particular, such nucleic acid fragments of the present invention include nucleic
acid molecules encoding: a polypeptide comprising amino acid residues from
about -1 to about 65 in SEQ ID NO:2; a polypeptide comprising
amino acid residues from about 71 to about 105 in SEQ ID NO:2; a polypeptide
comprising amino acid residues from about 114 to about 136 in SEQ ID NO:2;
a polypeptide comprising amino acid residues from about 148 to about 169 in
SEQ ID NO:2; a polypeptide comprising amino acid residues from about 174 to
about 198 in SEQ ID NO:2; a polypeptide comprising amino acid residues from
about 213 to about 229 in SEQ ID NO:2; a polypeptide comprising amino acid
residues from about 234 to about 253 in SEQ ID NO:2; and a polypeptide
comprising amino acid residues from about 267 to about 315 in SEQ ID
NO:2.



In addition, the present inventors have identified the following cDNA
clones related to extensive portions of the coding region of SEQ ID NO:1:
HHFBI55Ra (SEQ ID NO:12); HHFDB95R (SEQ ID NO:13); HUSFC71R (SEQ
ID NO:14); and HCE2S01R (SEQ ID NO:15). The present inventors have
identified the following cDNA clone related to an extensive portion of the
non-coding region of SEQ ID NO:1: HCEB157R (SEQ ID NO:16).

The following public ESTs, which relate to portions of the coding region
of SEQ ID NO:1, have also been identified: GenBank Accession No. W61032
(SEQ ID NO:17); GenBank Accession No. AA349552 (SEQ ID NO:18);
GenBank Accession No. R52311 (SEQ ID NO:19); GenBank Accession No.
AA351624 (SEQ ID NO:20); GenBank Accession No. C05172 (SEQ ID NO:21);
GenBank Accession No. T33818 (SEQ ID NO:22); GenBank Accession No.
AA324686 (SEQ ID NO:23); GenBank Accession No. Z42237 (SEQ ID NO:24);
GenBank Accession No. T30923 (SEQ ID NO:25); GenBank Accession No.
AA226979 (SEQ ID NO:26); GenBank Accession No. W45085 (SEQ ID
NO:27); GenBank Accession No. T31076 (SEQ ID NO:28); GenBank Accession
No. T08793 (SEQ ID NO:29); GenBank Accession No. R14945 (SEQ ID
NO:30); GenBank Accession No. AA031480 (SEQ ID NO:31); GenBank
Accession No. AA424460 (SEQ ID NO:32); GenBank Accession No. C05296
(SEQ ID NO:33); GenBank Accession No. R58671 (SEQ ID NO:34); GenBank
Accession No. T18925 (SEQ ID NO:35); and GenBank Accession No. R57834
(SEQ ID NO:36).

In another aspect, the invention provides an isolated nucleic acid molecule 
comprising a polynucleotide which hybridizes under stringent hybridization 
conditions to a portion of the polynucleotide in a nucleic acid molecule of the 
invention described above, for instance, the cDNA clone contained in ATCC 
Deposit 97728. By "stringent hybridization conditions" is intended overnight
incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (1x SSC =
150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x
Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon
sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
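The salt concentrations implied by the hybridization and wash buffers can be worked out as follows. The sketch assumes that the recited 150 mM NaCl and 15 mM trisodium citrate describe 1x SSC (the conventional definition) and that a 20x stock is used for dilution; the stock strength is an assumption, not a detail given in the specification.

```python
# Assumed composition of 1x SSC, in mM.
ONE_X_SSC_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}

def ssc_composition(strength):
    """Component concentrations (mM) of SSC at the given strength, e.g. 5 for 5x."""
    return {name: conc * strength for name, conc in ONE_X_SSC_MM.items()}

def stock_volume_ml(target_strength, final_volume_ml, stock_strength=20.0):
    """Volume of concentrated stock needed to reach target_strength in final_volume_ml."""
    return final_volume_ml * target_strength / stock_strength

print(ssc_composition(5))        # hybridization buffer: 750 mM NaCl, 75 mM citrate
print(ssc_composition(0.1))      # wash buffer: about 15 mM NaCl, 1.5 mM citrate
print(stock_volume_ml(5, 100))   # 25.0 ml of 20x stock per 100 ml
```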

By a polynucleotide which hybridizes to a "portion" of a polynucleotide 
is intended a polynucleotide (either DNA or RNA) hybridizing to at least about 
15 nucleotides (nt), and more preferably at least about 20 nt, still more preferably 
at least about 30 nt, and even more preferably about 30, 40, 50, 60, or 70 nt of the 
reference polynucleotide. These are useful as diagnostic probes and primers as 
discussed above and in more detail below. 

By a portion of a polynucleotide of "at least 20 nt in length," for example,
is intended 20 or more contiguous nucleotides from the nucleotide sequence of the
reference polynucleotide (e.g., the deposited cDNA or the nucleotide sequence as
shown in SEQ ID NO:1). Of course, a polynucleotide which hybridizes only to
a poly A sequence (such as the 3' terminal poly(A) tract of the CESP cDNA
shown in SEQ ID NO:1), or to a complementary stretch of T (or U) residues,
would not be included in a polynucleotide of the invention used to hybridize to a
portion of a nucleic acid of the invention, since such a polynucleotide would
hybridize to any nucleic acid molecule containing a poly(A) stretch or the
complement thereof (e.g., practically any double-stranded cDNA clone).
As indicated, nucleic acid molecules of the present invention which encode
a CESP polypeptide may include, but are not limited to, those encoding the amino
acid sequence of the mature polypeptide, by itself; the coding sequence for the
mature polypeptide and additional sequences, such as those encoding the about
21 amino acid leader or secretory sequence, such as a pre-, pro- or prepro-protein
sequence; and the coding sequence of the mature polypeptide together with
additional, non-coding sequences, including, for example, but not limited to,
introns and non-coding 5' and 3' sequences, such as transcribed, non-translated
sequences. The sequence encoding the polypeptide may also be fused to a marker
sequence, such as a sequence encoding a hexa-histidine peptide like the tag
provided in a pQE vector (QIAGEN, Inc.), among others, many of which are
commercially available. Hexa-histidine provides for convenient purification of the
fusion protein. The "HA" tag is another peptide useful for purification, which
corresponds to an epitope derived from the influenza hemagglutinin protein and
has been described by Wilson et al., Cell 37:767 (1984). As discussed below,
other such fusion proteins include the CESP protein fused to Fc at the N- or
C-terminus.

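The present invention further relates to variants of the nucleic acid
molecules of the present invention, which encode portions, analogs or derivatives
of the CESP protein. Variants may occur naturally, such as a natural allelic
variant.

Further embodiments of the invention include isolated nucleic acid
molecules comprising a polynucleotide having a nucleotide sequence at least 95%
identical, and more preferably at least 96%, 97%, 98% or 99% identical, to a
nucleotide sequence encoding the CESP polypeptide. By a polynucleotide having
a nucleotide sequence at least, for example, 95% "identical" to a reference
nucleotide sequence encoding a CESP polypeptide is intended that the
nucleotide sequence of the polynucleotide is identical to the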
reference sequence except that the polynucleotide sequence may include up to five
point mutations per each 100 nucleotides of the reference nucleotide sequence
encoding the CESP polypeptide. In other words, to obtain a polynucleotide
having a nucleotide sequence at least 95% identical to a reference nucleotide
sequence, up to 5% of the nucleotides in the reference sequence may be deleted
or substituted with another nucleotide, or a number of nucleotides up to 5% of the
total nucleotides in the reference sequence may be inserted into the reference
sequence. These mutations of the reference sequence may occur at the 5' or 3'
terminal positions of the reference nucleotide sequence or anywhere between
those terminal positions, interspersed either individually among nucleotides in the
reference sequence or in one or more contiguous groups within the reference
sequence.
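The percent identity criterion described above can be sketched for two sequences that have already been aligned. Counting each substituted, deleted or inserted position as one alteration against the reference length is one simple reading of the definition and is an assumption here; the sequences are hypothetical.

```python
def max_alterations(ref_len, min_identity_pct):
    """Largest number of altered positions compatible with the stated identity level."""
    return (ref_len * (100 - min_identity_pct)) // 100

def percent_identity(ref_aligned, query_aligned, gap="-"):
    """Identity over the full reference length for two equal-length aligned strings."""
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_len = sum(1 for base in ref_aligned if base != gap)
    # A column counts as an alteration on a substitution, deletion or insertion.
    alterations = sum(1 for r, q in zip(ref_aligned, query_aligned) if r != q)
    return 100 * (ref_len - alterations) / ref_len

print(max_alterations(100, 95))                      # 5
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```

A single substitution in a 10-nt reference already falls below the 95% level, whereas a 1000-nt reference tolerates up to 50 altered positions.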

As a practical matter, whether any particular nucleic acid molecule is at
least 95%, 96%, 97%, 98% or 99% identical to, for instance, the nucleotide
sequence shown in SEQ ID NO:1 or to the nucleotide sequence of the deposited
cDNA clone can be determined conventionally using known computer programs
such as the Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for
Unix, Genetics Computer Group, University Research Park, 575 Science Drive,
Madison, WI 53711). Bestfit uses the local homology algorithm of Smith and
Waterman, Advances in Applied Mathematics 2:482-489 (1981), to find the best
segment of homology between two sequences. When using Bestfit or any other
sequence alignment program to determine whether a particular sequence is, for
instance, 95% identical to a reference sequence according to the present invention,
the parameters are set, of course, such that the percentage of identity is calculated
over the full length of the reference nucleotide sequence and that gaps in
homology of up to 5% of the total number of nucleotides in the reference
sequence are allowed.
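The local homology algorithm of Smith and Waterman can be sketched in a few lines. This is a minimal illustration with a linear gap penalty and arbitrary scores (match +1, mismatch -1, gap -2); these parameters are assumptions and are not the scoring scheme or defaults of the Bestfit program.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (best score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    # Fill the scoring matrix; scores never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until a zero score is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Hypothetical sequences sharing the 7-nt segment ACGTACG.
score, seg_a, seg_b = smith_waterman("TTTACGTACGTTT", "GGGACGTACGGGG")
print(score, seg_a, seg_b)  # 7 ACGTACG ACGTACG
```

Note that a local alignment reports only the best-matching segment; the percentage of identity discussed above is still to be calculated over the full length of the reference sequence.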

The present application is directed to nucleic acid molecules at least 95%,
96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in SEQ ID
NO:1 or to the nucleic acid sequence of the deposited cDNA, irrespective of
whether they encode a polypeptide having CESP activity. This is because
even where a particular nucleic acid molecule does not encode a polypeptide
having CESP activity, one of skill in the art would still know how to use the
nucleic acid molecule, for instance, as a hybridization probe or a polymerase chain
reaction (PCR) primer. Uses of the nucleic acid molecules of the present
invention that do not encode a polypeptide having CESP activity include, inter
alia, (1) isolating the CESP gene or allelic variants thereof in a cDNA library; (2)
in situ hybridization (e.g., "FISH") to metaphase chromosomal spreads to provide
precise chromosomal location of the CESP gene, as described in Verma et al.,
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New
York (1988); and (3) Northern Blot analysis for detecting CESP mRNA
expression in specific tissues.

Preferred, however, are nucleic acid molecules having sequences at least
95%, 96%, 97%, 98% or 99% identical to the nucleic acid sequence shown in
SEQ ID NO:1 or to the nucleic acid sequence of the deposited cDNA which do,
in fact, encode a polypeptide having CESP activity. By "a polypeptide having
CESP activity" is intended polypeptides exhibiting CESP activity in a particular
biological assay. For example, CESP protein activity can be measured using the
quantitative in vitro assay for angiogenesis as described by
Sueishi et al. (1992; pages 192-198). This assay utilizes a
model of angiogenesis in a culture system using a
reconstituted subendothelial matrix. Tubular
structures are measured morphometrically using an image analyzer. Briefly, this
assay involves isolating and culturing capillary endothelial cells (for example, from
bovine adrenal cortex or another suitable source), adding CESP
protein to the cell culture, and measuring morphometrically the total length of
tubular structures using phase-contrast microscopic photography.

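Of course, due to the degeneracy of the genetic code, one of ordinary skill
in the art will immediately recognize that a large number of the nucleic acid
molecules having a sequence at least 95%, 96%, 97%, 98%, or 99% identical to
the nucleic acid sequence shown in SEQ ID NO:1 or to the nucleic acid sequence
of the deposited cDNA will encode a polypeptide having CESP activity. In
fact, since degenerate variants of these nucleotide sequences all encode the same
polypeptide, this will be clear to the skilled artisan even without performing the
above described comparison assay. It will be further recognized in the art that,
for such nucleic acid molecules that are not degenerate variants, a reasonable
number will also encode a polypeptide having CESP activity. This is because
the skilled artisan is fully aware of amino acid substitutions that are either
less likely or not likely to significantly affect protein function (e.g., replacing one
aliphatic amino acid with a second aliphatic amino acid).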



Vectors and Host Cells



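The present invention also relates to vectors which include the isolated
DNA molecules of the present invention, host cells which are genetically
engineered with the recombinant vectors, and the production of CESP
polypeptides or fragments thereof by recombinant techniques.

The polynucleotides may be joined to a vector containing a selectable
marker for propagation in a host. Generally, a plasmid vector is introduced in a
precipitate, such as a calcium phosphate precipitate, or in a complex with a
charged lipid. If the vector is a virus, it may be packaged in vitro using an
appropriate packaging cell line and then transduced into host cells.

The DNA insert should be operatively linked to an appropriate promoter,
such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the
SV40 early and late promoters and promoters of retroviral LTRs, to name a few.
Other suitable promoters will be known to the skilled artisan. The
expression constructs will further contain sites for transcription initiation and
termination and a site for translation, with a translation initiating codon at the
beginning and a termination codon appropriately positioned at the end of the
polypeptide to be translated.

Representative examples of appropriate hosts include animal cells such as
CHO, COS and Bowes melanoma cells; and plant cells.

Vectors for use in bacteria include those available from Stratagene, as well as
ptrc99a, pKK223-3 and pKK233-3. Other suitable vectors will be readily
apparent to the skilled artisan.

Introduction of the construct into the host cell can be effected by calcium
phosphate transfection, DEAE-dextran mediated transfection, cationic
lipid-mediated transfection, or other methods. Such methods are described in
many standard laboratory manuals.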
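The polypeptide may be expressed in a modified form, such as a fusion
protein, and may include not only secretion signals, but also additional
heterologous functional regions. For instance, a region of additional amino acids,
particularly charged amino acids, may be added to the N-terminus of the
polypeptide to improve stability and persistence in the host cell, during
purification, or during subsequent handling and storage. In drug discovery, for
example, human proteins, such as hIL-5, have been fused with Fc portions for the
purpose of high-throughput screening assays to identify antagonists of hIL-5. See
D. Bennett et al., Journal of Molecular Recognition 8:52-58 (1995) and
K. Johanson et al., The Journal of Biological Chemistry 270, No.
16:9459-9471 (1995).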

The CESP protein can be recovered and purified from recombinant cell
cultures by well-known methods including ammonium sulfate or ethanol
precipitation, acid extraction and chromatographic methods. Most preferably,
high performance liquid chromatography
("HPLC") is employed for purification. Polypeptides of the present invention
include products of chemical synthetic procedures, and
products produced by recombinant techniques. In addition, polypeptides of the
invention may include an initial modified methionine residue, in some cases as a
result of host-mediated processes.

CESP Polypeptides and Fragments 

The invention further provides an isolated CESP polypeptide having the
amino acid sequence encoded by the deposited cDNA, or the amino acid sequence
in SEQ ID NO:2, or a peptide or polypeptide comprising a portion of the above
polypeptides.

It will be recognized in the art that some amino acid sequences of the
CESP polypeptide can be varied without significant effect on the structure or
function of the protein. If such differences in sequence are contemplated, it should
be remembered that there will be critical areas on the protein which determine
activity.

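Thus, the invention further includes variations of the CESP polypeptide
which show substantial CESP polypeptide activity or which include regions of
CESP protein such as the protein portions discussed below. Such mutants include
deletions, insertions, inversions, repeats, and type substitutions. Guidance
concerning which amino acid changes are likely to be phenotypically silent
can be found in Bowie, J.U., et al., "Deciphering the Message in Protein
Sequences: Tolerance to Amino Acid Substitutions," Science 247:1306-1310
(1990).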
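Thus, the fragment, derivative or analog of the polypeptide of SEQ ID
NO:2, or that encoded by the deposited cDNA, may be (i) one in which one or
more of the amino acid residues are substituted with a conserved or
non-conserved amino acid residue, or (ii) one in which one or more of the amino
acid residues includes a substituent group, or (iii) one in which the mature
polypeptide is fused with another compound, such as a compound to increase the
half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which
additional amino acids are fused to the mature polypeptide, such as an IgG Fc
fusion region peptide or leader or secretory sequence. Such fragments, derivatives
and analogs are deemed to be within the scope of those skilled in the art from the
teachings herein.

Of particular interest are substitutions of charged amino acids with another
charged amino acid and with neutral or negatively charged amino acids. The latter
results in proteins with reduced positive charge to improve the characteristics of
the CESP protein. The prevention of aggregation is highly desirable. Aggregation
of proteins not only results in a loss of activity but can also be problematic when
preparing pharmaceutical formulations, because they can be immunogenic
(Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes
36:838-845 (1987); and Cleland et al., Crit. Rev. Therapeutic Drug Carrier
Systems 10:307-377 (1993)).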

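The replacement of amino acids can also change the selectivity of binding
to cell surface receptors. Ostade et al., Nature 361:266-268 (1993), describes
certain mutations resulting in selective binding of TNF-α to only one of the two
known types of TNF receptors. Thus, the CESP of the present invention may
include one or more amino acid substitutions, deletions or additions, either
from natural mutations or human manipulation.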
Of course, the number of amino acid substitutions a skilled artisan would
make depends on many factors, including those described above. Generally
speaking, the number of amino acid substitutions for any given CESP polypeptide
will not be more than 50, 40, 30, 20, 10, 5, or 3.

Amino acids in the CESP protein of the present invention that are essential
for function can be identified by methods known in the art, such as site-directed
mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, Science
244:1081-1085 (1989)). The latter procedure introduces single alanine mutations
at every residue in the molecule. The resulting mutant molecules are then tested
for biological activity, such as in vitro proliferative activity. Sites that are critical
for ligand-receptor binding can also be determined by structural analysis (de Vos
et al., Science 255:306-312 (1992)).
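The set of single alanine mutants used in alanine-scanning mutagenesis can be enumerated as sketched below. The four-residue peptide is hypothetical, and skipping positions that are already alanine is an assumption made for illustration.

```python
def alanine_scan(protein):
    """Return (label, mutant sequence) pairs, one per non-alanine position."""
    mutants = []
    for pos, residue in enumerate(protein, start=1):
        if residue == "A":
            continue  # assumed: an existing alanine is left unchanged
        label = f"{residue}{pos}A"  # conventional notation, e.g. K2A
        mutants.append((label, protein[:pos - 1] + "A" + protein[pos:]))
    return mutants

for label, mutant in alanine_scan("MKAL"):
    print(label, mutant)
# M1A AKAL
# K2A MAAL
# L4A MKAA
```

Each mutant would then be expressed and tested for biological activity; a loss of activity suggests that the replaced residue is essential for function.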

The polypeptides of the present invention are preferably provided in an
isolated form. By "isolated polypeptide" is intended a polypeptide removed from
its native environment. Thus, a polypeptide produced or contained in a
recombinant host cell is considered "isolated" for the purposes of the present
invention. Also intended as "isolated" is a polypeptide that has been purified,
partially or substantially, from a recombinant host or from a native source. For
example, a recombinantly produced version of the CESP polypeptide can be
substantially purified by the one-step method described in Smith and Johnson,
Gene 67:31-40 (1988).

The polypeptides of the present invention include the complete polypeptide
encoded by the deposited cDNA; the mature polypeptide encoded by the
deposited cDNA; amino acid residues -21 to 329 of SEQ ID NO:2; amino acid
residues -20 to 329 of SEQ ID NO:2; and amino acid residues 1 to 329 in SEQ
ID NO:2, as well as polypeptides which are at least 95% identical, and more
preferably at least 96%, 97%, 98% or 99% identical to the polypeptide encoded
by the deposited cDNA, and to the polypeptides of SEQ ID NO:2, and also
include portions of such polypeptides with at least 30 amino acids and more
preferably at least 50 amino acids.

By a polypeptide having an amino acid sequence at least, for example,
95% "identical" to a reference amino acid sequence of a CESP polypeptide is
intended that the amino acid sequence of the polypeptide is identical to the
reference sequence except that the polypeptide sequence may include up to five
amino acid alterations per each 100 amino acids of the reference amino acid
sequence of the CESP polypeptide. In other words, to obtain a polypeptide having
an amino acid sequence at least 95% identical to a reference amino acid sequence,
up to 5% of the amino acid residues in the reference sequence may be deleted or
substituted with another amino acid, or a number of amino acids up to 5% of the
total amino acid residues in the reference sequence may be inserted into the
reference sequence. These alterations of the reference sequence may occur at the
amino or carboxy terminal positions of the reference amino acid sequence or
anywhere between those terminal positions, interspersed either individually among
residues in the reference sequence or in one or more contiguous groups within the
reference sequence.
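
The arithmetic behind this definition can be sketched as follows. This is a minimal illustration only: the function name `max_alterations` is hypothetical, and the 329-residue length is used on the assumption that the reference is the polypeptide of residues 1 to 329 of SEQ ID NO:2.

```python
def max_alterations(reference_length: int, percent_identity: int) -> int:
    """Largest whole number of altered residues that still satisfies
    "at least percent_identity % identical" for the given reference length."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return reference_length * (100 - percent_identity) // 100


# Five alterations per 100 residues at 95% identity, as stated in the text.
print(max_alterations(100, 95))
# Assumed 329-residue reference at each identity level named in the text.
for level in (95, 96, 97, 98, 99):
    print(level, max_alterations(329, level))
```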

As a practical matter, whether any particular polypeptide is at least 95%,
96%, 97%, 98% or 99% identical to, for instance, the amino acid sequence shown
in SEQ ID NO:2 or to the amino acid sequence encoded by deposited cDNA
clone can be determined conventionally using known computer programs such as the
Bestfit program (Wisconsin Sequence Analysis Package, Version 8 for Unix,
Genetics Computer Group, University Research Park, 575 Science Drive,
Madison, WI 53711). When using Bestfit or any other sequence alignment
program to determine whether a particular sequence is, for instance, 95% identical
to a reference sequence according to the present invention, the parameters are set,
of course, such that the percentage of identity is calculated over the full length of
the reference amino acid sequence and that gaps in homology of up to 5% of the
total number of amino acid residues in the reference sequence are allowed.
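
A percentage of identity calculated over the full length of the reference sequence can be sketched as below. This is not the Bestfit algorithm: the sketch assumes that an alignment has already been produced by some alignment program and is supplied as two equal-length strings with `-` marking gaps. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_reference: str, aligned_query: str) -> float:
    """Percent identity over the full (ungapped) length of the reference.

    Every alignment column that is not an exact residue match counts as one
    alteration: a substitution, a deletion (gap in the query), or an
    insertion (gap in the reference).
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    alterations = sum(
        1 for r, q in zip(aligned_reference, aligned_query) if r != q
    )
    identity = 100.0 * (reference_length - alterations) / reference_length
    return max(identity, 0.0)


# Toy 10-residue reference with one substitution in the query.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```

Dividing by the reference length, rather than by the alignment length, mirrors the requirement that identity be calculated over the full length of the reference sequence.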

The polypeptide of the present invention could be used as a molecular 
weight marker on SDS-PAGE gels or on molecular sieve gel filtration columns 
using methods well known to those of skill in the art. 

In another aspect, the invention provides a peptide or polypeptide 
comprising an epitope-bearing portion of a polypeptide of the invention. The 
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• - tod , ^ as a^::::rt -° r m 

— ■ — - - -rr.-r 

«t is well known in the art th a , r i , m,body can bin <0> 

M > N. and Learner R A AntiW u ' h, " niCk ' 

Protein, can be characterized hv . ♦ , • P ™ y Se<JUence of a 

«7«-778( I9M)at777 S * for »"^W«so»„ <Ce ,, 

"S*d to graerat . cesp , •« P °' Wep,ldes °' Prides that can be 

Borate CESP- Sp ec,fic antibodies include- > „„i 
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IP**** coding amino «*, reiidlles ^ lbou , 71 10 about , „ 
m , NO* a po.ypep.ide coming emino acid residuK ^ ^ , „ ^ 
36 ,„ SEQ ,D N0:2; a p „ Iypepnde ^ ^ ^ ^ 

H8 to ahou, ,69 in SEQ ID NO : 2; a golypeprioe com P riaing ^ ^ resite 

t " b ° U ' ' 74 '° *" " 8 SE< 3 10 NO* a peptide coniprisbg amino 
«d residues from ,„ 0111 21J t0 ^ ^ h ^ ffl ^ ^ 

comprising a^no acid fem 2J4 ^ ^ ^ ^ 

an a polypeptide comprising an™ acid residlles from and ^ ' ' 

315inSEQIDNO:2. 

The epitope-bearing peptides and polypeptides of the invention may be
produced by any conventional means (Houghten, R. A., General method for the
rapid solid-phase synthesis of large numbers of peptides: specificity of
antigen-antibody interaction at the level of individual amino acids, Proc. Natl.
Acad. Sci. USA 82:5131-5135 (1985)). This "Simultaneous Multiple Peptide
Synthesis (SMPS)" process is further described in U.S. Patent No. 4,631,211 to
Houghten et al. (1986).

As one of skill in the art will appreciate, CESP polypeptides of the present
invention and the epitope-bearing fragments thereof can be combined with parts
of the constant domain of immunoglobulins, resulting in chimeric polypeptides.
These fusion proteins facilitate purification and show an increased half-life
in vivo. This has been shown, e.g., for chimeric proteins consisting of the
first two domains of the human CD4-polypeptide and various domains of the
constant regions of the heavy or light chains of mammalian immunoglobulins
(EPA 394,827; Traunecker et al., Nature 331:84-86 (1988)). Fusion proteins that
have a disulfide-linked dimeric structure can also be more efficient in binding
and neutralizing other molecules than the monomeric CESP protein or protein
fragment alone (Biochem. 270:3958-3964 (1995)).
Diagnostic and Prognostic Applications of CESP



It is believed that certain tissues in mammals with a CESP-related disorder
express significantly enhanced or diminished levels of the CESP protein and
mRNA encoding the CESP protein when compared to a corresponding "standard"
mammal, i.e., a mammal of the same species not having the disorder. Further, it
is believed that enhanced or diminished levels of the CESP protein can be detected
in certain body fluids (e.g., blood, sera, plasma, urine, and spinal fluid) from
mammals with the disorder when compared to sera from mammals of the same
species not having the disorder. Thus, the invention provides a diagnostic method
useful during diagnosis, which involves assaying the expression level of the gene
encoding the CESP protein in mammalian cells or body fluid and comparing the
gene expression level with a standard CESP gene expression level, whereby an
increase in the gene expression level over the standard is indicative of certain
disorders.
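
The comparison of a measured expression level against a standard level can be sketched as follows. This is a minimal illustration and not a validated diagnostic: the function name and the two-fold threshold are assumptions, since the text does not state a numerical cutoff.

```python
def compare_to_standard(sample_level: float,
                        standard_level: float,
                        fold_threshold: float = 2.0) -> str:
    """Classify a sample level relative to a standard level.

    fold_threshold is a hypothetical cutoff chosen for illustration only.
    Both levels must be in the same units (e.g., relative mRNA abundance).
    """
    if standard_level <= 0 or sample_level < 0:
        raise ValueError("levels must be non-negative and the standard positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = sample_level / standard_level
    if ratio >= fold_threshold:
        return "enhanced"
    if ratio <= 1.0 / fold_threshold:
        return "diminished"
    return "unchanged"


print(compare_to_standard(10.0, 4.0))
```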

CESP-related disorders include, but are not limited to, coronary restenosis
following coronary revascularization, coronary artery thrombus or occlusion,
myocardial infarction, atrial and/or ventricular arrhythmias, heart block, hereditary
medial "necrosis" of small coronary and pulmonary arteries, focal fibromuscular
dysplasia of small coronary arteries, cardiomyopathy, arrhythmogenic right
ventricular dysplasia, and sudden death.

Where a diagnosis has already been made according to conventional 
methods, the present invention is useful as a prognostic indicator, whereby 
patients exhibiting enhanced or decreased CESP gene expression will experience 
a worse clinical outcome relative to patients expressing the gene at a lower level. 
Further, CESP is detected in amniotic cells. It is believed that CESP can serve as 
a marker for fetal genetic defects. Such fetal genetic defects include 
developmental cardiac defects. 

By "assaying the expression level of the gene encoding the CESP protein"
is intended qualitatively or quantitatively measuring or estimating the level of the
CESP protein or the level of the mRNA encoding the CESP protein in a first
biological sample either directly (e.g., by determining or estimating absolute
protein level or mRNA level) or relatively (e.g., by comparing to the CESP protein
level or mRNA level in a second biological sample).

Preferably, the CESP protein level or mRNA level in the first biological
sample is measured or estimated and compared to a standard CESP protein level
or mRNA level. Once a standard CESP protein level or mRNA level is known, it
can be used repeatedly as a standard for comparison.

By "biological sample" is intended any biological sample obtained from an
individual, cell line, tissue culture, or other source which contains CESP protein
or mRNA. Biological samples include mammalian body fluids (such as blood,
sera, plasma, urine, synovial fluid, spinal fluid, and amniotic cells) which
contain secreted mature CESP protein, heart, renal glomerulus, and renal tubule.

The present invention is useful for detecting CESP-related disorders in
mammals. Preferred mammals include monkeys, apes, cats, dogs, cows, pigs,
horses, rabbits and humans. Particularly preferred are humans.

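Total cellular RNA can be isolated from a biological sample using the
single-step guanidinium-thiocyanate-phenol-chloroform method described in
Chomczynski and Sacchi, Anal. Biochem. 162:156-159 (1987). Levels of mRNA
encoding the CESP protein are then assayed using any appropriate method. These
include Northern blot analysis (Cell 63:303-312 (1990)), S1 nuclease
mapping (Fujita et al., Cell 49:357-367 (1987)), the polymerase chain reaction
(PCR), reverse transcription in combination with the polymerase chain reaction
(RT-PCR), and reverse transcription in combination with the ligase chain
reaction (RT-LCR).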

Assaying CESP protein levels in a biological sample can occur using
antibody-based techniques. For example, CESP protein expression in tissues can
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/D y " (1985); Jalkanen, M eta/ I r»v n , 
(1987)). ' ^ Ce// • Ao/ - ^^3087-3096 

Other antibody-based methods useful for detects CESP 

(ELISA) and the radioimmunoassay (RIA) ^ """"""^ 

- inch.de en^e labeIs , such ^ ^ ^ " *" " 

iodine (™I an „ k ' radl °««>tope S> such as 

u,c <. i, 1), carbon ( l4 Q sutohnr f 3 ^ «. 



CESP Protein Therapy 



It is believed that HSF plavs a mk in - j 
Physiological processes in which pbsd • l H"ences. 

re8U l. ti o„„f bloodpressllreandcardi . ac ' - tM , 

of transudate of plasma ,. . " a,nUre ^ 

-„of h o OT o:, ch ;:::x:~r:r, f ' her ^ or 

vasopressin. ' endothe '»is, renin, and 

*" T p,ays a ro " * a «~* — - <■ * 

neart. Moreover, it i S believed that tpci> 

•^CKrevascolanzali™^ ..■ further, is believed that CESP 

v^lanzattonof^c^fo^^^^ 
(e.g., coronary artery bypass surgery; percutaneous transluminal coronary
angioplasty; or administration of an anticoagulant such as heparin, hirudin, 
urokinase, streptokinase, or tissue plasminogen activator) and prevents or inhibits 
restenosis of coronary arteries following revascularization therapy. Accordingly, 
when CESP is administered to a patient receiving revascularization therapy, 
CESP enhances revascularization of cardiac muscle and prevents or inhibits 
restenosis of coronary arteries. 

It is also believed that CESP facilitates angiogenesis (i.e., the formation of
new vascular tissue). Accordingly, administration of CESP to patients afflicted
by circulatory illnesses facilitates angiogenesis. Circulatory illnesses for which
CESP treatment is beneficial include atherosclerotic heart disease, coronary artery
constriction, coronary artery blockage (either partial or full), myocardial
infarction, venous thrombosis, and Raynaud's syndrome.

The present invention is useful for treating or preventing CESP-related 
disorders in mammals. Preferred mammals include monkeys, apes, cats, dogs, 
cows, pigs, horses, rabbits and humans. Particularly preferred are humans. 

Modes of administration 

It will be appreciated that conditions caused by a decrease in the standard
or normal level of CESP activity in an individual can be treated by administration
of CESP protein. Thus, the invention further provides a method of treating an 
individual in need of an increased level of CESP activity comprising administering 
to such an individual a pharmaceutical composition comprising an effective 
amount of an isolated CESP polypeptide of the invention, particularly a mature 
form of the CESP, effective to increase the CESP activity level in such an 
individual. 

As a general proposition, the total pharmaceutically effective amount of
CESP polypeptide administered parenterally per dose will be in the range of about
1 µg/kg/day to 10 mg/kg/day of patient body weight, although, as noted above,
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n.g* 8 /dayf„ r « hel , ora , on , , f °°> - 

Mg/kg/hour, either by 1-4 iniectinnQ n-. a l 

infi , • , J PCr day or ^ continuous subcutaneous 

SS"*-"— — -*Z 

Pharmaceutical compositions containing the CESP nrotein nf,», 
oucaily, or as an oral or nasal snmv p„ ti . F A 

to modes of administration which include fan. • 

^^^-^rzzzr 

Chromosome Assays 

The nuclei, acid mofecuta of preseM ^ 

us.ng a vane* of „.„ tecWqUK ^ ' 
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WJ one exon m the genomic DMA tk.,c 

- >»» — individual bun™ ~— 
Fluorescence ft, a** hybridization ("FISH") of a cDNA , 
metaphaae chromocomal spread can be „. H , * '° * 

location in one This ,T " ' * Chr0m ° S °"» l 

or do bp. For a revew ofthis tectaique, see Venna « at H 

-d-pd* oucbdZir me ~ be ~ "» 
™ P p«d .ot jr^r , senes ^ *— *« ^ *- 

Next, ,t is necessary to detennine the differences in the cDNA or . 
sequence between affert^ , a ~ UNA or genomic 

oenveen attected and unaffected individuals Tf a m ♦ . 

in some or all of the affected inW v , L tat '° n 18 observed 

and are not tended as " * - ° f 
Examples

Example 1: Expression and Purification of CESP in E. coli

T* baaeriaj expression veoor pQEdO is used for blclerial „ 
*. — ph. (Q.AGEN, tec, 9259 E,„„ AvMue> chalswonh ' " 
PQE60 encodes m p WIli n Mlibiotic ^ ^ * J 

( X « codons encoding UslidilK resite ^ 
-« n.c-e^o-.H.nce.c acid („, ^ ^ ^ ; P < ^£ 1 

- -anged auch ,haa an ins^ed DNA ftng^, encoding , 
Jesses ,„a, po lypeptioe ^ the six Ks 8 ' 

The DNA science encoding «he desired port i„„ of the CK 
a*ng «. h^ophohic ,eadec aeon™, is — from ^" 
■one us „g PCR ohgonoCeoaide prfm „ s ^ ^ ^ ^ ™ 
« of ,he desiced portion of fte CE S P pr o,ei„ a„ d „ „ h 
2 ~ 3 ' '° ** C ™ A — — Addidonl nnciel 

tne 5 and 3 sequences, respectively. 

For cloning the mature protein, the 5' nrimpr J,,* »u 

^ pnmer has the sequence S'-CinaA 

153 of the CESP cDNA sequence set out in SEQ ID NO 1 One 
5. P H m „ .egina „» y ^ ^ , 0 

^CIClAfiATrAAATCTCTTCCCCTCCCAOCAOTO. (SEQ m NO 5) 
containing the underlined Xba I restriction site followed by 24 nucleotides
complementary to nucleotides 1101 to 1124 of the CESP DNA sequence set out
in SEQ ID NO:1, with the coding sequence aligned with the restriction site so as
to maintain its reading frame with that of the six His codons in the pQE60 vector.

The amplified CESP DNA fragment and the vector pQE60 are digested
with BamH I and Xba I restriction enzymes and the digested DNAs are then
ligated together. Insertion of the CESP DNA into the restricted pQE60 vector
places the CESP protein coding region downstream from the IPTG-inducible
promoter and in-frame with an initiating AUG and the six histidine codons.
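
The primer layout described above (a restriction site followed by nucleotides complementary to the 3' end of the target region) can be sketched in Python. The helper names, the two-nucleotide 5' clamp, and the toy template are hypothetical and are not the sequences of SEQ ID NO:1 or SEQ ID NO:5. The only external fact relied on is the Xba I recognition sequence, TCTAGA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
XBA_I_SITE = "TCTAGA"  # Xba I recognition sequence


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def build_reverse_primer(template: str, start: int, end: int,
                         site: str = XBA_I_SITE, clamp: str = "GC") -> str:
    """Build a 3' (reverse) primer, written 5' to 3'.

    start and end are 1-based inclusive coordinates on the coding strand.
    The primer is: clamp + restriction site + reverse complement of the
    chosen template region. The clamp is an assumption of this sketch.
    """
    if not 1 <= start <= end <= len(template):
        raise ValueError("coordinates out of range")
    target = template[start - 1:end]
    return clamp + site + reverse_complement(target)


# Toy 12-nt template; the primer anneals to its last six nucleotides.
print(build_reverse_primer("ATGGCCAAGTGA", 7, 12))
```

In a real design, the position of the restriction site relative to the coding sequence would also be checked so that the reading frame with the six His codons is maintained, as the text requires.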

The ligation mixture is transformed into competent E. coli cells using
standard procedures such as those described in Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY (1989). E. coli strain M15/rep4, containing multiple
copies of the plasmid pREP4, which expresses the lac repressor and confers
kanamycin resistance ("Kanr"), is used in carrying out the illustrative example
described herein. This strain, which is only one of many that are suitable for
expressing CESP protein, is available commercially from QIAGEN, Inc., supra.
Transformants are identified by their ability to grow on LB plates in the presence
of ampicillin and kanamycin. Plasmid DNA is isolated from resistant colonies and
the identity of the cloned DNA confirmed by restriction analysis, PCR and DNA
sequencing.

Clones containing the desired constructs are grown overnight ("O/N") in
liquid culture in LB media supplemented with both ampicillin (100 µg/ml) and
kanamycin (25 µg/ml). The O/N culture is used to inoculate a large culture at a
dilution of approximately 1:25 to 1:250. The cells are grown to an optical density
at 600 nm ("OD600") of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside
("IPTG") is then added to a final concentration of 1 mM to induce transcription
from the lac repressor sensitive promoter, by inactivating the lacI repressor.
Cells subsequently are incubated further for 3 to 4 hours. Cells then are
harvested by centrifugation.
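
The dilution and induction arithmetic in the preceding paragraph can be sketched as follows. The function names, the 500 ml culture volume, and the 1 M IPTG stock are assumptions made for illustration; only the 1:25 to 1:250 dilution, the OD600 window of 0.4 to 0.6, and the 1 mM final IPTG concentration come from the text.

```python
def inoculum_volume_ml(culture_volume_ml: float, dilution_factor: float) -> float:
    """Volume of overnight culture for a 1:dilution_factor inoculation."""
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return culture_volume_ml / dilution_factor


def stock_volume_ml(final_conc_mM: float, culture_volume_ml: float,
                    stock_conc_mM: float) -> float:
    """Stock volume from C1*V1 = C2*V2, ignoring the small volume added."""
    if stock_conc_mM <= 0:
        raise ValueError("stock_conc_mM must be positive")
    return final_conc_mM * culture_volume_ml / stock_conc_mM


def ready_to_induce(od600: float) -> bool:
    """True when OD600 is within the 0.4 to 0.6 window given in the text."""
    return 0.4 <= od600 <= 0.6


culture_ml = 500.0  # hypothetical culture volume
print(inoculum_volume_ml(culture_ml, 25), inoculum_volume_ml(culture_ml, 250))
if ready_to_induce(0.5):
    # Hypothetical 1 M (1000 mM) IPTG stock, 1 mM final concentration.
    print(stock_volume_ml(1.0, culture_ml, 1000.0))
```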
The cells are then stirred for 3-4 hours at 4°C in 6 M guanidine-HCl, pH 8.
The cell debris is removed by centrifugation, and the supernatant containing the
CESP protein is loaded onto a nickel-nitrilo-tri-acetic acid ("Ni-NTA") affinity
resin column (available from QIAGEN, Inc., supra). Proteins with a 6 x His tag
bind to the Ni-NTA resin and can be purified in a one-step procedure (for details
see: The QIAexpressionist, 1995, QIAGEN, Inc., supra). Briefly, the supernatant
is loaded onto the column in 6 M guanidine-HCl, pH 8; the column is first washed
with 10 volumes of 6 M guanidine-HCl, pH 8, then washed with 10 volumes of
6 M guanidine-HCl, pH 6, and finally the CESP protein is eluted with 6 M
guanidine-HCl, pH 5.

The purified protein is then renatured by dialyzing it against phosphate-
buffered saline (PBS) or 50 mM Na-acetate, pH 6 buffer plus 200 mM NaCl.
Alternatively, the protein can be successfully refolded while immobilized on the
Ni-NTA column. The recommended conditions are as follows: renature using a
linear 6 M-1 M urea gradient in 500 mM NaCl, 20% glycerol, 20 mM Tris/HCl
pH 7.4, containing protease inhibitors. The renaturation should be performed over
a period of 1.5 hours or more. After renaturation the proteins can be eluted by the
addition of 250 mM imidazole. Imidazole is removed by a final dialyzing step
against PBS or 50 mM sodium acetate pH 6 buffer plus 200 mM NaCl. The
purified protein is stored at 4°C or frozen at -80°C.
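
The linear urea gradient used for on-column refolding can be sketched as a simple interpolation. The function name is hypothetical, and the 90-minute default is an assumption that corresponds to the minimum period of 1.5 hours stated in the text; longer gradients are equally consistent with it.

```python
def urea_molarity(elapsed_min: float, total_min: float = 90.0,
                  start_molar: float = 6.0, end_molar: float = 1.0) -> float:
    """Urea concentration (M) at a given time in a linear gradient.

    Times before the start or after the end are clamped to the end points.
    """
    if total_min <= 0:
        raise ValueError("total_min must be positive")
    fraction = min(max(elapsed_min / total_min, 0.0), 1.0)
    return start_molar + (end_molar - start_molar) * fraction


# Concentration at the start, midpoint, and end of a 90-minute gradient.
for minute in (0, 45, 90):
    print(minute, urea_molarity(minute))
```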

Example 2: Cloning and Expression of CESP in a Baculovirus Expression System

The cDNA sequence encoding the m ^ asp ^ ^ 
deposed done w amplined „ sing pCR oli80nucleol| . de pijmeis M 
10 the 5 and 3 'sequences of the gene: 

^ *' Primer M "* ^ UK1M 5 '"T«:CGC<5S4ICCGCCATCATG 
CA.K ; <K^ < KKK«AC-3-( SEQIDN o :o ,c. n ^ 8theundertiKdBaii]H 

73-92 of Che CESP protein coding sequence set out in SEQ ID NO 1 Inserted 
into an expression vector, as described below, the 5' end of the amplified fragment
encoding CESP provided an efficient signal peptide. An efficient signal for
initiation of translation in eukaryotic cells, as described by Kozak, M., J. Mol.
Biol. 196:947-950 (1987), was appropriately located in the vector portion of the
construct.

The 3' primer had the sequence 5'-GCACAfifilACCCACAGCCTGGTC-
CAGATCTAAATCTCTTCCCCTCCCAG-3' (SEQ ID NO:7), containing the
underlined Asp718 restriction site followed by 42 nucleotides complementary to
nucleotides 1105-1145 of the CESP cDNA sequence set out in SEQ ID NO:1.

The amplified fragment was isolated from a 1% agarose gel using a
commercially available kit ("Geneclean," BIO 101 Inc., La Jolla, CA). The
fragment then was digested with BamH I and Asp718 and again was purified on
a 1% agarose gel. This fragment is designated herein F2.

The vector pA2 was used to express the CESP protein in the baculovirus
expression system, using standard methods, as described in Summers et al., A
Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures,
Texas Agricultural Experimental Station Bulletin No. 1555 (1987). This
expression vector contains the strong polyhedrin promoter of the Autographa
californica nuclear polyhedrosis virus (AcMNPV) followed by convenient
restriction sites. The polyadenylation site of the simian virus 40 ("SV40") is used
for efficient polyadenylation. For an easy selection of recombinant virus the beta-
galactosidase gene from E. coli is inserted in the same orientation as the
polyhedrin promoter and is followed by the polyadenylation signal of the
polyhedrin gene. The polyhedrin sequences are flanked at both sides by viral
sequences for cell-mediated homologous recombination with wild-type viral DNA
to generate a viable virus that expresses the cloned polynucleotide.

Many other baculovirus vectors could be used in place of pA2, such as
pA2-GP (which contains the AcMNPV gp67 signal peptide), pAc373, pVL941
and pAcIM1, provided, as those of skill readily will appreciate, that construction
provides appropriately located signals for transcription, translation, trafficking
and the like. Such vectors are described in Luckow et al., Virology 170:31-39,
among others.

The pA2 plasmid was digested with the restriction enzymes BamH I and
Asp718. The DNA was then isolated from a 1% agarose gel using a commercially
available kit ("Geneclean," BIO 101 Inc., La Jolla, CA). This vector DNA is
designated herein "V2".

Fragment F2 and the dephosphorylated plasmid V2 were ligated together
with T4 DNA ligase. E. coli HB101 cells were transformed with the ligation
mixture and spread on culture plates. Bacteria were identified that contained the
plasmid with the human CESP gene by digesting DNA from individual colonies
using BamH I and Asp718 and then analyzing the digestion product by gel
electrophoresis. The sequence of the cloned fragment was confirmed by DNA
sequencing. This plasmid is designated herein pBacCESP.

5 µg of the plasmid pBacCESP were co-transfected with 1.0 µg of a
commercially available linearized baculovirus DNA ("BaculoGold™ baculovirus
DNA", Pharmingen, San Diego, CA), using the lipofection method described by
Felgner et al., Proc. Natl. Acad. Sci. USA. The BaculoGold™ virus DNA and
5 µg of the plasmid pBacCESP were mixed in a well of a microtiter plate
containing 50 µl of serum-free Grace's medium (Life Technologies Inc.,
Gaithersburg, MD). Afterwards, 10 µl Lipofectin plus 90 µl Grace's medium
were added, mixed and incubated for 15 minutes at room
temperature. Then the transfection mixture was added drop-wise to Sf9 insect
cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml Grace's
medium without serum. The plate was rocked back and forth to mix the newly
added solution. The plate was then incubated for 5 hours at 27°C. After 5 hours
the transfection solution was removed from the plate and 1 ml of Grace's insect
medium supplemented with 10% fetal calf serum was added. The plate was put
back into an incubator and cultivation was continued at 27°C for four days.

After four days, the supernatant was collected and a plaque assay was
performed, as described by Summers and Smith, cited above. An agarose gel with
"Blue Gal" (Life Technologies Inc., Gaithersburg) was used to allow easy
identification and isolation of gal-expressing clones, which produced blue-stained
plaques. (A detailed description of a "plaque assay" of this type can also be found
in the user's guide for insect cell culture and baculovirology distributed by Life
Technologies Inc., Gaithersburg, page 9-10).

Four days after serial dilution, the virus was added to the cells. After
appropriate incubation, blue stained plaques were picked with the tip of an
Eppendorf pipette. The agar containing the recombinant viruses was then
resuspended in an Eppendorf tube containing 200 µl of Grace's medium. The agar
was removed by a brief centrifugation and the supernatant containing the
recombinant baculovirus was used to infect Sf9 cells seeded in 35 mm dishes.
Four days later, the supernatants of these culture dishes were harvested and then 
they were stored at 4°C. A clone containing properly inserted CESP
was identified by DNA analysis including restriction mapping and sequencing. 
This is designated herein as V-CESP. 

Sf9 cells were grown in Grace's medium supplemented with 10% heat-
inactivated FBS. The cells were infected with the recombinant baculovirus
V-CESP at a multiplicity of infection ("MOI") of about 2 (about 1 to about 3).
Six hours later the medium was removed and was replaced with SF900 II medium
minus methionine and cysteine (available from Life Technologies Inc.,
Gaithersburg). 42 hours later, 5 µCi of 35S-methionine and 5 µCi 35S-cysteine
(available from Amersham) were added. The cells were further incubated for 16
hours and then they were harvested by centrifugation, lysed and the labeled
proteins were visualized by SDS-PAGE and autoradiography.
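
The volume of virus stock needed to reach a given multiplicity of infection can be sketched as follows. MOI is the number of infectious particles per cell, so the calculation needs a virus titer and a cell count, neither of which is stated in the text. The function name, the 1e6 cells, and the 1e8 pfu/ml titer are therefore hypothetical values used only for illustration.

```python
def virus_volume_ml(moi: float, cell_count: float,
                    titer_pfu_per_ml: float) -> float:
    """Volume of virus stock (ml) that delivers moi pfu per cell."""
    if moi < 0 or cell_count < 0:
        raise ValueError("moi and cell_count must be non-negative")
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer_pfu_per_ml must be positive")
    return moi * cell_count / titer_pfu_per_ml


cells = 1e6   # hypothetical number of Sf9 cells in the dish
titer = 1e8   # hypothetical titer of the V-CESP stock, pfu/ml
# MOI of about 2, with the stated range of about 1 to about 3.
for moi in (1, 2, 3):
    print(moi, virus_volume_ml(moi, cells, titer))
```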

Example 3: Cloning and Expression in Mammalian Cells 

A typical mammalian expression vector contains the promoter element,
which mediates the initiation of transcription of mRNA, the protein coding
sequence, and signals required for the termination of transcription and
polyadenylation of the transcript. Additional elements include enhancers, Kozak
sequences and intervening sequences flanked by donor and acceptor sites for RNA
splicing. Highly efficient transcription can be achieved with the early and late
promoters from SV40, the long terminal repeats (LTRs) from retroviruses, e.g.,
RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus (CMV).
However, cellular elements can also be used (e.g., the human actin promoter).
Suitable expression vectors for use in practicing the present invention include, for
example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden),
pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146) and pBC12MI (ATCC
67109). Mammalian host cells that could be used include human HeLa, 293, H9
and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV 1, quail
QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO) cells.

Alternatively, the gene can be expressed in stable cell lines that contain the 
gene integrated into a chromosome. The co-transfection with a selectable marker 
such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation 
of the transfected cells. 

The transfected gene can also be amplified to express large amounts of the
encoded protein. The DHFR (dihydrofolate reductase) marker is useful to
develop cell lines that carry several hundred or even several thousand copies of the
gene of interest. Another useful selection marker is the enzyme glutamine
synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991); Bebbington et al.,
Bio/Technology 10:169-175 (1992)). Using these markers, the mammalian cells
are grown in selective medium and the cells with the highest resistance are
selected. These cell lines contain the amplified gene(s) integrated into a
chromosome. Chinese hamster ovary (CHO) and NSO cells are often used for the
production of proteins.

The expression vectors pC1 and pC4 contain the strong promoter (LTR)
of the Rous Sarcoma Virus (Cullen et al., Molecular and Cellular Biology, 438-
447 (March, 1985)) plus a fragment of the CMV-enhancer (Boshart et al., Cell
41:521-530 (1985)). Multiple cloning sites, e.g., with the restriction enzyme
cleavage sites BamH I, Xba I and Asp718, facilitate the cloning of the gene of
interest. The vectors contain in addition the 3' intron, the polyadenylation and
termination signal of the rat preproinsulin gene.

Example 3(a): Cloning and Expression in COS Cells 

The expression plasmid, pCESP HA, is made by cloning a cDNA encoding 
CESP into the expression vector pcDNAI/Amp or pcDNA3 (which can be 
obtained from Invitrogen, Inc.). 

The expression vector pcDNA3 contains: (1) an E. coli origin of
replication effective for propagation in E. coli and other prokaryotic cells; (2) an
ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3)
an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV
promoter, a polylinker, an SV40 intron; (5) several codons encoding a
hemagglutinin fragment (i.e., an "HA" tag to facilitate purification) followed by
a termination codon and polyadenylation signal arranged so that a cDNA can be
conveniently placed under expression control of the CMV promoter and operably
linked to the SV40 intron and the polyadenylation signal by means of restriction
sites in the polylinker. The HA tag corresponds to an epitope derived from the
influenza hemagglutinin protein described by Wilson et al., Cell 37:767 (1984).
The fusion of the HA tag to the target protein allows easy detection and recovery
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of .he recomCinam pro,™ M 

pcDNA3 contains, ta ^ fc ^ ^ ~ 

mKcoli W,M„„- . ""ox vectors for expression of CESP 

The Z?Z1 ^ " - to - 

* 20 nuclides (nucleolMK ?3 . 92 f ^ AU ° «« codon f„hW ed 

'■"•HAtagiau^ „ e3 . ^ cDNA sequence set out in SEQ ID NO: 1 . 
tag is used, the 3 pnmer has the sequence 5'-GTnv~r « ^ . „ . 

TCTAAGCGTAGTCTOGGACGTCGTATG^TAA^l^^ AOA " 
AGCAG-3' (SEQ „ NO:9) , co „ r^Tr 

sequence set out in SEQ ID NO. l NA 

ThePCRan.plifiedDNAfiagn.entandthevectorpcDNAS a „• 
with Xba I restrict,™ P<*>NA3, are digested 

»«. - n h ^ P j~:t m ^r;r esys,ems ' 

culture is olaterf n„ 35 CA 92037 )' ^ the transformed 

c P lated °n ampicillm media olate* ™h„u *u 

•heCESP^dtogft^^, ' ™^ for the p, Kence 0 f 
For expression of recombinant CESP, COS cells are transfected with an
expression vector, as described above, using DEAE-dextran, as described, for
instance, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989). Cells are
incubated under conditions for expression of CESP by the vector.

Expression of the CESP-HA fusion protein is detected by radiolabeling
and immunoprecipitation, using methods described in, for example, Harlow et al.,
Antibodies: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, New York (1988). To this end, two days after
transfection, the cells are labeled by incubation in media containing 35S-cysteine
for 8 hours. The cells and the media are collected, and the cells are washed and
lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1%
SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al., cited
above. Proteins are precipitated from the cell lysate and from the culture media
using an HA-specific monoclonal antibody. The precipitated proteins then are
analyzed by SDS-PAGE and autoradiography. An expression product of the
expected size is seen in the cell lysate, which is not seen in negative controls.
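
Preparing the RIPA buffer named above from concentrated stocks is a dilution calculation, sketched below. The final concentrations are those given in the text; the stock concentrations (5 M NaCl, 10% NP-40, 10% SDS, 10% DOC, 1 M Tris pH 7.5), the 100 ml batch size, and all names are assumptions made for illustration.

```python
from typing import Dict, Tuple

# component: (final concentration, stock concentration), same unit per pair.
# Final values are from the text; stock values are hypothetical.
RIPA_RECIPE: Dict[str, Tuple[float, float]] = {
    "NaCl (mM)": (150.0, 5000.0),
    "NP-40 (%)": (1.0, 10.0),
    "SDS (%)": (0.1, 10.0),
    "DOC (%)": (0.5, 10.0),
    "Tris pH 7.5 (mM)": (50.0, 1000.0),
}


def ripa_volumes_ml(total_volume_ml: float) -> Dict[str, float]:
    """Stock volumes (ml) per component plus water to reach the total."""
    if total_volume_ml <= 0:
        raise ValueError("total_volume_ml must be positive")
    volumes = {
        name: final * total_volume_ml / stock
        for name, (final, stock) in RIPA_RECIPE.items()
    }
    volumes["water"] = total_volume_ml - sum(volumes.values())
    return volumes


for component, volume in ripa_volumes_ml(100.0).items():
    print(f"{component}: {volume:.2f} ml")
```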

Example 3(b): Cloning and Expression in CHO Cells 

The vector pC4 was used for the expression of the CESP protein. Plasmid
pC4 is a derivative of the plasmid pSV2-dhfr (ATCC Accession No. 37146). The
plasmid contains the mouse DHFR gene under control of the SV40 early
promoter. Chinese hamster ovary or other cells lacking dihydrofolate reductase
activity that are transfected with these plasmids can be selected by growing the
cells in a selective medium (alpha minus MEM, Life Technologies) supplemented
with the chemotherapeutic agent methotrexate. The amplification of the DHFR
genes in cells resistant to methotrexate (MTX) has been well documented (see,
e.g., Alt, F. W., Kellems, R. M., Bertino, J. R., and Schimke, R. T., 1978, J. Biol.
Chem. 253:1357-1370; Hamlin, J. L. and Ma, C. 1990, Biochem. et Biophys. Acta,
1097:107-143; Page, M. J. and Sydenham, M. A. 1991, Biotechnology 9:64-68).
Cells grown in increasing concentrations of MTX develop resistance to the drug
by overproducing the target enzyme, DHFR, as a result of amplification of the
DHFR gene. If a second gene is linked to the DHFR gene, it is usually
co-amplified and over-expressed. It is known in the art that this approach may be
used to develop cell lines carrying more than 1,000 copies of the amplified
gene(s). Subsequently, when the methotrexate is withdrawn, cell lines are
obtained which contain the amplified gene integrated into one or more
chromosome(s) of the host cell.

Plasmid pC4 contains, for expressing the gene of interest, the strong
promoter of the long terminal repeat (LTR) of the Rous Sarcoma Virus (Cullen
et al., Molecular and Cellular Biology, March 1985:438-447) plus a fragment
isolated from the enhancer of the immediate early gene of human cytomegalovirus
(CMV) (Boshart et al., Cell 41:521-530 (1985)). Downstream of the promoter
are BamH I, Xba I, and Asp718 restriction enzyme cleavage sites that allow
integration of the genes. Behind these cloning sites the plasmid contains the 3'
intron and polyadenylation site of the rat preproinsulin gene. Other high efficiency
promoters can also be used for the expression, e.g., the human β-actin promoter,
the SV40 early or late promoters or the long terminal repeats from other
retroviruses, e.g., HIV and HTLVI. Clontech's Tet-Off and Tet-On gene
expression systems and similar systems can be used to express the CESP protein
in a regulated way in mammalian cells (Gossen, M., & Bujard, H. 1992, Proc.
Natl. Acad. Sci. USA 89:5547-5551). For the polyadenylation of the mRNA,
other signals, e.g., from the human growth hormone or globin genes, can be used
as well. Stable cell lines carrying a gene of interest integrated into the
chromosomes can also be selected upon co-transfection with a selectable marker
such as gpt, G418 or hygromycin. It is advantageous to use more than one
selectable marker in the beginning, e.g., G418 plus methotrexate.
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The plasmid pC4 was digested with the restriction enzyme BamH I and 
thendephosphorylated using calf intestinal phosphatase by procedures known in 
the art. The vector was then isolated from a 1% agarose gel. 

The DNA sequence encoding the complete CESP protein, including its
leader sequence, was amplified using PCR oligonucleotide primers corresponding
to the 5' and 3' sequences of the gene. The 5' primer had the sequence
5'-GCTGCCGCGGATCCGCCACCATGCAGCGGCTTGGGGCCACC-3' (SEQ ID
NO:10), containing a BamH I restriction enzyme site (GGATCC), an efficient
signal for initiation of translation in eukaryotes, as described by Kozak, M., J.
Mol. Biol. 196:947-950 (1987), followed by 21 nucleotides corresponding to
nucleotides 73-93 of the CESP protein coding sequence set out in SEQ ID NO:1.
The 3' primer had the sequence 5'-CACACGCGGATCCAGATCTAAATCTCTTCCCCTC-3'
(SEQ ID NO:11), containing a BamH I restriction site (GGATCC) followed by 24
nucleotides (nucleotides 1109-1132) complementary to the CESP protein coding
sequence set out in SEQ ID NO:1, including the stop codon.
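
The primer features described above can be checked with a short script. The following is a minimal sketch, not part of the described procedure: the primer strings are those of SEQ ID NO:10 and SEQ ID NO:11, the two reference fragments are copied from SEQ ID NO:1 and SEQ ID NO:12, and the function names (describe_forward, describe_reverse) are hypothetical names chosen for illustration.

```python
# Primer sequences as listed in SEQ ID NO:10 and SEQ ID NO:11.
FWD_PRIMER = "GCTGCCGCGGATCCGCCACCATGCAGCGGCTTGGGGCCACC"
REV_PRIMER = "CACACGCGGATCCAGATCTAAATCTCTTCCCCTC"

# Reference fragments copied from the sequence listing:
# nucleotides 73-93 of SEQ ID NO:1, and a stretch of SEQ ID NO:12
# that spans the end of the coding region and the stop codon.
CDS_START = "ATGCAGCGGCTTGGGGCCACC"
STOP_REGION = "CACTGCTGGGAGGGGAAGAGATTTAGATCTGGACCAGGCT"

BAMHI_SITE = "GGATCC"  # BamH I recognition sequence
KOZAK = "GCCACC"       # translation initiation context placed 5' of the ATG


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def describe_forward(primer):
    """Locate the BamH I site, the Kozak context and the gene-specific part."""
    site = primer.find(BAMHI_SITE)
    atg = primer.find("ATG", site + len(BAMHI_SITE))
    return {
        "bamhi_at": site,
        "kozak_ok": primer[atg - len(KOZAK):atg] == KOZAK,
        "gene_part": primer[atg:],
    }


def describe_reverse(primer):
    """Locate the BamH I site and convert the 3' tail to the sense strand."""
    site = primer.find(BAMHI_SITE)
    sense_tail = reverse_complement(primer[site + len(BAMHI_SITE):])
    return {
        "bamhi_at": site,
        "sense_tail": sense_tail,
        "anneals": sense_tail in STOP_REGION,
        # ATT (Ile, last codon) followed by TAG (stop) in SEQ ID NO:1
        "covers_stop": "ATTTAG" in sense_tail,
    }


if __name__ == "__main__":
    print(describe_forward(FWD_PRIMER))
    print(describe_reverse(REV_PRIMER))
```

The sketch only inspects the listed strings; it makes no statement about annealing temperature or PCR conditions, which are not given here.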

The amplified fragment was digested with the restriction enzyme BamH
I and then purified again on a 1% agarose gel. The isolated fragment and the
dephosphorylated vector were then ligated with T4 DNA ligase. E. coli HB101
or XL-1 Blue cells were then transformed, and bacteria containing the fragment
inserted into plasmid pC4 were identified by restriction enzyme analysis.
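
The digestion step can be illustrated in silico. The sketch below is an assumption-laden illustration, not the described protocol: it builds a toy amplicon from the two primers (SEQ ID NO:10 and the reverse complement of SEQ ID NO:11) with a run of N as a hypothetical stand-in for the coding sequence between them, and cuts the top strand at G^GATCC, the known BamH I cleavage position. The function name bamhi_digest is hypothetical.

```python
BAMHI_SITE = "GGATCC"  # BamH I cuts the top strand after the first G: G^GATCC

FWD_PRIMER = "GCTGCCGCGGATCCGCCACCATGCAGCGGCTTGGGGCCACC"  # SEQ ID NO:10
REV_PRIMER = "CACACGCGGATCCAGATCTAAATCTCTTCCCCTC"         # SEQ ID NO:11

# Hypothetical placeholder for the coding sequence between the primers.
PLACEHOLDER = "N" * 30


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T/N string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def bamhi_digest(seq):
    """Split the top strand of seq at every BamH I site (G^GATCC)."""
    fragments = []
    start = 0
    pos = seq.find(BAMHI_SITE)
    while pos != -1:
        fragments.append(seq[start:pos + 1])  # keep the G 5' of the cut
        start = pos + 1
        pos = seq.find(BAMHI_SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Toy amplicon: forward primer, placeholder insert, reverse primer as sense strand.
amplicon = FWD_PRIMER + PLACEHOLDER + reverse_complement(REV_PRIMER)
fragments = bamhi_digest(amplicon)

if __name__ == "__main__":
    for fragment in fragments:
        print(len(fragment), fragment)
```

In this toy model the middle fragment carries a GATC overhang position at both ends, so it could ligate into the BamH I-cut vector in either orientation, which is consistent with the restriction enzyme analysis used above to identify the desired clones.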

Chinese hamster ovary cells lacking an active DHFR gene were used for
transfection. 5 μg of the expression plasmid pC4 were cotransfected with 0.5 μg
of the plasmid pSV2-neo using lipofectin (Felgner et al., supra). The plasmid
pSV2-neo contains a dominant selectable marker, the neo gene from Tn5
encoding an enzyme that confers resistance to a group of antibiotics including
G418. The cells were seeded in alpha minus MEM supplemented with 1 mg/ml
G418. After 2 days, the cells were trypsinized and seeded in hybridoma cloning
plates (Greiner, Germany) in alpha minus MEM supplemented with 10, 25, or 50
ng/ml of methotrexate plus 1 mg/ml G418. After about 10-14 days single clones
were trypsinized and then seeded in 6-well petri dishes or 10 ml flasks using
different concentrations of methotrexate (50 nM, 100 nM, 200 nM, 400 nM, 800
nM). Clones growing at the highest concentrations of methotrexate were then
transferred to new 6-well plates containing even higher concentrations of
methotrexate (1 μM, 2 μM, 5 μM, 10 μM, 20 μM). The same procedure was
repeated until clones were obtained which grow at a concentration of 100-200
μM. Expression of the desired gene product was analyzed by SDS-PAGE and
Western blot or by reverse phase HPLC analysis.

Northern blot analysis was carried out to examine CESP gene expression
in human tissues, using the methods described by [illegible].
[illegible] was labeled with [illegible] using the [illegible] DNA labeling system
(Amersham Life Science), according to the manufacturer's instructions. After
labeling, the probe was purified using a [illegible] column [illegible].
The purified labeled probe was then used to examine various human
tissues for CESP mRNA.

Multiple Tissue Northern blots containing [illegible]
lung, liver, skeletal muscle, kidney, pancreas, spleen, thymus, prostate, [illegible]
small intestine, colon and blood were obtained from
Clontech and were examined with the labeled probe using ExpressHyb [illegible].
Following hybridization [illegible], the blots were [illegible]
exposed to film at [illegible] overnight, and the films were developed according to standard
procedures. An abundant [illegible] kilobase transcript was detected in heart and brain,
smooth muscle, HSC172 cells and osteoblastoma cells.

It will be clear that the invention may be practiced otherwise than as
particularly described in the foregoing description and examples.

Numerous modifications and variations of the present invention are 
possible in light of the above teachings and, therefore, are within the scope of the 
appended claims. 

The entire disclosure of all publications (including patents, patent
applications, journal articles, laboratory manuals, books, or other documents)
cited herein is hereby incorporated by reference.

(ii) TITLE OF INVENTION: CEREBELLUM AND EMBRYO SPECIFIC PROTEIN

(A) ADDRESSEE: STERNE, KESSLER, GOLDSTEIN & FOX, P.L.L.C.

(F) ZIP: 20005-3934

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: To be assigned 

(B) FILING DATE: Herewith 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/033,870 
(B) FILING DATE: 20-DEC-1996
(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: STEFFE, ERIC K. 

(B) REGISTRATION NUMBER: 36,688 

(C) REFERENCE/DOCKET NUMBER: 1488.061PC01

(ix) TELECOMMUNICATION INFORMATION:

(2) INFORMATION FOR SEQ ID NO:1:

(A) NAME/KEY: CDS

(B) LOCATION: 73..1122

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 73..133

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 136..1122



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGAGGATCCG GGTTCGGTTG CTCTGGCGAG GGCTCCAGCA TCACAGGCGG CGGCTGCGGG  60

CGCAGAGCGG AG ATG CAG CGG CTT GGG GCC ACC CTG CTG TGC CTG CTG  108
             Met Gln Arg Leu Gly Ala Thr Leu Leu Cys Leu Leu
             -21 -20                 -15                 -10

CTG GCG GCG GCG GTC CCC ACG GCC CCC GCG CCC GCT CCG ACG GCG ACC  156
Leu Ala Ala Ala Val Pro Thr Ala Pro Ala Pro Ala Pro Thr Ala Thr
                -5                  1               5

TCG GCT CCA GTC AAG CCC GGC CCG GCT CTC AGC TAC CCG CAG GAG GAG  204
Ser Ala Pro Val Lys Pro Gly Pro Ala Leu Ser Tyr Pro Gln Glu Glu
        10                  15                  20

GCC ACC CTC AAT GAG ATG TTC CGC GAG GTT GAG GAA CTG ATG GAG GAC  252
Ala Thr Leu Asn Glu Met Phe Arg Glu Val Glu Glu Leu Met Glu Asp
    25                  30                  35

ACG CAG CAC AAA TTG CGC AGC GCG GTG GAA GAG ATG GAG GCA GAA GAA  300
Thr Gln His Lys Leu Arg Ser Ala Val Glu Glu Met Glu Ala Glu Glu
40                  45                  50                  55

GCT GCT GCT AAA GCA TCA TCA GAA GTG AAC CTG GCA AAC TTA CCT CCC  348
Ala Ala Ala Lys Ala Ser Ser Glu Val Asn Leu Ala Asn Leu Pro Pro
                60                  65                  70

AGC TAT CAC AAT GAG ACC AAC ACA GAC ACG AAG GTT GGA AAT AAT ACC  396
Ser Tyr His Asn Glu Thr Asn Thr Asp Thr Lys Val Gly Asn Asn Thr
            75                  80                  85

ATC CAT GTG CAC CGA GAA ATT CAC AAG ATA ACC AAC AAC CAG ACT GGA  444
Ile His Val His Arg Glu Ile His Lys Ile Thr Asn Asn Gln Thr Gly
        90                  95                  100

CAA ATG GTC TTT TCA GAG ACA GTT ATC ACA TCT GTG GGA GAC GAA GAA  492
Gln Met Val Phe Ser Glu Thr Val Ile Thr Ser Val Gly Asp Glu Glu
    105                 110                 115

GGC AGA AGG AGC CAC GAG TGC ATC ATC GAC GAG GAC TGT GGG CCC AGC  540
Gly Arg Arg Ser His Glu Cys Ile Ile Asp Glu Asp Cys Gly Pro Ser
120                 125                 130                 135

ATG TAC TGC CAG TTT GCC AGC TTC CAG TAC ACC TGC CAG CCA TGC CGG  588
Met Tyr Cys Gln Phe Ala Ser Phe Gln Tyr Thr Cys Gln Pro Cys Arg
                140                 145                 150

GGC CAG AGG ATG CTC TGC ACC CGG GAC AGT GAG TGC TGT GGA GAC CAG  636
Gly Gln Arg Met Leu Cys Thr Arg Asp Ser Glu Cys Cys Gly Asp Gln
            155                 160                 165
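
CTG TGT GTC TGG GGT CAC TGC ACC AAA ATG GCC ACC AGG GGC AGC AAT  684
Leu Cys Val Trp Gly His Cys Thr Lys Met Ala Thr Arg Gly Ser Asn
        170                 175                 180

GGG ACC ATC TGT GAC AAC CAG AGG GAC TGC CAG CCG GGG CTG TGC TGT  732
Gly Thr Ile Cys Asp Asn Gln Arg Asp Cys Gln Pro Gly Leu Cys Cys
    185                 190                 195

GCC TTC CAG AGA GGC CTG CTG TTC CCT GTG TGC ACA CCC CTG CCC GTG  780
Ala Phe Gln Arg Gly Leu Leu Phe Pro Val Cys Thr Pro Leu Pro Val
200                 205                 210                 215

GAG GGC GAG CTT TGC CAT GAC CCC GCC AGC CGG CTT CTG GAC CTC ATC  828
Glu Gly Glu Leu Cys His Asp Pro Ala Ser Arg Leu Leu Asp Leu Ile
                220                 225                 230

ACC TGG GAG CTA GAG CCT GAT GGA GCC TTG GAC CGA TGC CCT TGT GCC  876
Thr Trp Glu Leu Glu Pro Asp Gly Ala Leu Asp Arg Cys Pro Cys Ala
            235                 240                 245

AGT GGC CTC CTC TGC CAG CCC CAC AGC CAC AGC CTG GTG TAT GTG TGC  924
Ser Gly Leu Leu Cys Gln Pro His Ser His Ser Leu Val Tyr Val Cys
        250                 255                 260

AAG CCG ACC TTC GTG GGG AGC CGT GAC CAA GAT GGG GAG ATC CTG CTG  972
Lys Pro Thr Phe Val Gly Ser Arg Asp Gln Asp Gly Glu Ile Leu Leu
    265                 270                 275

CCC AGA GAG GTC CCC GAT GAG TAT GAA GTT GGC AGC TTC ATG GAG GAG  1020
Pro Arg Glu Val Pro Asp Glu Tyr Glu Val Gly Ser Phe Met Glu Glu
280                 285                 290                 295

GTG CGC CAG GAG CTG GAG GAC CTG GAG AGG AGC CTG ACT GAA GAG ATG  1068
Val Arg Gln Glu Leu Glu Asp Leu Glu Arg Ser Leu Thr Glu Glu Met
                300                 305                 310

GCG CTG GGG GAG CCT GCG GCT GCC GCC GCT GCA CTG CTG GGA GGG GAA  1116
Ala Leu Gly Glu Pro Ala Ala Ala Ala Ala Ala Leu Leu Gly Gly Glu
            315                 320                 325

GAG ATT TAGATCTGGA CCAGGCTGTG GGTAGATGTG CAATAGAAAT AGCTAATTTA  1172
Glu Ile

TTTCCCCAGG TGTGTGCTTT AGGCGTGGGC TGACCAGGCT TCTTCCTACA TCTTCTTCCC  1232
AGTAAGTTTC CCCTCTGGCT TGACAGCATG AGGTGTTGTG CATTTGTTCA GCTCCCCCAG  1292
GCTGTTCTCC AGGCTTCACA GTCTGGTGCT TGGGAGAGTC AGGCAGGGTT AAACTGCAGG  1352
AGCAGTTTGC CACCCCTGTC CAGATTATTT [illegible]  1412
[illegible] TACATGGCTT TGATAATTGT TTGAGGGGAG GAGATGGAAA CAATGTGGAG  1472
TCTCCCTCTG ATTGGTTTTG GGGAAATGTG GAGAAGAGTG CCCTGCTTTG CAAACATCAA  1532
CCTGGCAAAA ATGCAACAAA TGAATTTTCC ACGCAGTTCT TTCCATGGGC ATAGGTAAGC  1592
TGTGCCTTCA GCTGTTGCAG ATGAAATGTT CTGTTCACCC TGCATTACAT GTGTTTATTC  1652
ATCCAGCAGT GTTGCTCAGC TCCTACCTCT GTGCCAGGGC AGCATTTTCA TATCCAAGAT  1712
CAA[illegible]CCCTC TCTCAGCACA GCCTGGGGAG GGGGTCATTG TTCTCCTCGT CCATCAGGGA  1772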
TCTCAGAGGC TCAGAGACTG CAAGCTGCTT GCCCAAGTCA CACAGCTAGT GAAGACCAGA  1832
GCAGTTTCAT CTGGTTGTGA CTCTAAGCTC AGTGCTCTCT CCACTACCCC ACACCAGCCT  1892
TGGTGCCACC AAAAGTGCTC CCCAAAAGGA AGGAGAATGG GATTTTTCTT TTGAGGCATG  1952
CACATCTGGA ATTAAGGTCA AACTAATTCT CACATCCCTC TAAAAGTAAA CTACTGTTAG  2012
GAACAGCAGT GTTCTCACAG TGTGGGGCAG CCGTCCTTCT AATGAAGACA ATGATATTGA  2072
CACTGTCCCT CTTTGGCAGT TGCATTAGTA ACTTTGAAAG GTATATGACT GAGCGTAGCA  2132
TACAGGTTAA CCTGCAGAAA CAGTACTTAG GTAATTGTAG GGCGAGGATT ATAAATGAAA  2192
TTTGCAAAAT CACTTAGCAG CAACTGAAGA CAATTATCAA CCACGTGGAG AAAATCAAAC  2252
CGAGCAGTGC TGTGTGAAAC ATGGTTGTAA TATGCGACTG CGAACACTGA ACTCTACGCC  2312
ACTCCACAAA TGATGTTTTC AGGTGTCATG GACTGTTGCC ACCATGTATT CATCCAGAGT  2372
TCTTAAAGTT TAAAGTTGCA CATGATTGTA TAAGCATGCT TTCTTTGAGT TTTAAATTAT  2432
GTATAAACAT AAGTTGCATT TAGAAATCAA GCATAAATCA CTTCAACTGC TAAAAAAA  2490

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gln Arg Leu Gly Ala Thr Leu Leu Cys Leu Leu Leu Ala Ala Ala
    -20                 -15                 -10

Val Pro Thr Ala Pro Ala Pro Ala Pro Thr Ala Thr Ser Ala Pro Val
-5                  1               5                   10

Lys Pro Gly Pro Ala Leu Ser Tyr Pro Gln Glu Glu Ala Thr Leu Asn
            15                  20                  25

Glu Met Phe Arg Glu Val Glu Glu Leu Met Glu Asp Thr Gln His Lys
        30                  35                  40

Leu Arg Ser Ala Val Glu Glu Met Glu Ala Glu Glu Ala Ala Ala Lys
    45                  50                  55

Ala Ser Ser Glu Val Asn Leu Ala Asn Leu Pro Pro Ser Tyr His Asn
60                  65                  70                  75

Glu Thr Asn Thr Asp Thr Lys Val Gly Asn Asn Thr Ile His Val His
                80                  85                  90

Arg Glu Ile His Lys Ile Thr Asn Asn Gln Thr Gly Gln Met Val Phe
            95                  100                 105

Ser Glu Thr Val Ile Thr Ser Val Gly Asp Glu Glu Gly Arg Arg Ser
        110                 115                 120

His Glu Cys Ile Ile Asp Glu Asp Cys Gly Pro Ser Met Tyr Cys Gln
    125                 130                 135

Phe Ala Ser Phe Gln Tyr Thr Cys Gln Pro Cys Arg Gly Gln Arg Met
140                 145                 150                 155

Leu Cys Thr Arg Asp Ser Glu Cys Cys Gly Asp Gln Leu Cys Val Trp
                160                 165                 170

Gly His Cys Thr Lys Met Ala Thr Arg Gly Ser Asn Gly Thr Ile Cys
            175                 180                 185

Asp Asn Gln Arg Asp Cys Gln Pro Gly Leu Cys Cys Ala Phe Gln Arg
        190                 195                 200

Gly Leu Leu Phe Pro Val Cys Thr Pro Leu Pro Val Glu Gly Glu Leu
    205                 210                 215

Cys His Asp Pro Ala Ser Arg Leu Leu Asp Leu Ile Thr Trp Glu Leu
220                 225                 230                 235

Glu Pro Asp Gly Ala Leu Asp Arg Cys Pro Cys Ala Ser Gly Leu Leu
                240                 245                 250

Cys Gln Pro His Ser His Ser Leu Val Tyr Val Cys Lys Pro Thr Phe
            255                 260                 265

Val Gly Ser Arg Asp Gln Asp Gly Glu Ile Leu Leu Pro Arg Glu Val
        270                 275                 280

Pro Asp Glu Tyr Glu Val Gly Ser Phe Met Glu Glu Val Arg Gln Glu
    285                 290                 295

Leu Glu Asp Leu Glu Arg Ser Leu Thr Glu Glu Met Ala Leu Gly Glu
300                 305                 310                 315

Pro Ala Ala Ala Ala Ala Ala Leu Leu Gly Gly Glu Glu Ile
                320                 325

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 344 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Pro Ala Pro Arg Arg Arg Trp Leu Leu Leu Leu Ala Val Leu Ala Ala
                5                   10                  15

Leu Cys Cys Ala Ala Ala Gly Ser Gly Gly Arg Arg Arg Ala Ala Ser
            20                  25                  30

Leu Gly Glu Met Leu Arg Glu Val Glu Ala Leu Met Glu Asp Thr Gln
        35                  40                  45

His Lys Leu Arg Asn Ala Val Gln Glu Met Glu Ala Glu Glu Glu Gly
    50                  55                  60

Ala Lys Lys Leu Ser Glu Val Asn Phe Glu Asn Leu Pro Pro Thr Tyr
65                  70                  75                  80

His Asn Glu Ser Asn Thr Glu Thr Arg Ile Gly Asn Lys Thr [illegible] [illegible]
                85                  90                  95

Thr His Gln Glu Ile Asp Lys Val Thr Asp Asn Arg Thr Gly Ser Thr
            100                 105                 110

Ile Phe Ser Glu Thr Ile Ile Thr Ser Ile Lys Gly Gly [illegible] Asn Lys
        115                 120                 125

Arg Asn His Glu Cys Ile Ile Asp Glu Asp Cys Glu Thr Gly Lys Tyr
    130                 135                 140

Cys Gln Phe Ser Thr Phe Glu Tyr Lys Cys Gln Pro Cys Lys Thr Gln
145                 150                 155                 160

His Thr His Cys Ser Arg Asp Val Glu Cys Cys Gly Asp Gln Leu Cys
                165                 170                 175

Val Trp Gly Glu Cys Arg Lys Ala Thr Ser Arg Gly Glu Asn Gly Thr
            180                 185                 190

Ile Cys Glu Asn Gln His Asp Cys Asn Pro Gly Thr Cys Cys Ala Phe
        195                 200                 205

Gln Lys Glu Leu Leu Phe Pro Val Cys Thr Pro Leu Pro Glu Glu Gly
    210                 215                 220

Glu Pro Cys His Asp Pro Ser Asn Arg Leu Leu Asn Leu Ile Thr Trp
225                 230                 235                 240

Glu Leu Glu Pro Asp Gly Val Leu Glu Arg Cys Pro Cys Ala Ser Gly
                245                 250                 255

[illegible] Ser [illegible] Ser Val Cys Glu
265 270

Leu Ser Ser Asn Glu Thr Arg Lys Asn Glu Lys Glu Asp Pro Leu Asn
        275                 280                 285

Met Asp Glu Met Pro Phe Ile Ser Leu Ile Pro Arg Asp Ile Leu Ser
    290                 295                 300

Asp Tyr Glu Glu Ser Ser Val Ile Gln Glu Val Arg Lys Glu Leu Glu
305                 310                 315                 320

Ser Leu Glu Asp Gln Ala Gly Val Lys Ser Glu His Asp Pro Ala His
                325                 330                 335

Asp Leu Phe Leu Gly Asp Glu Ile
            340

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

GGGAGGATCC gcgcccgctc cgacggcg

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
SCCTCTAGAT TAAATCTCTT CCCCTCCCAG CAGT
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TGCCGCGGAT CCGCCATCAT GCAGCGGCTT GGGGCCAC
(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
SCACAGGTAC CCACAGCCTG GTCCAGATCT AAATCTCTTC CCCTCCCAG
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GTCTCTAGAC AGATCTAAAT CTCTTCCCCT CCCAG 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 65 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GTCTCTAGAC AGATCTAAGC GTAGTCTGGG ACGTCGTATG GGTAAATCTC TTCCCCTCCC  60
AGCAG  65

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GCTGCCGCGG ATCCGCCACC ATGCAGCGGC TTGGGGCCAC C

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 34 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CACACGCGGA TCCAGATCTA AATCTCTTCC CCTC  34

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
AATTCGGCAC GAGCCTGACT GAAGAGATGG CGCTGAGGGA CCCTNCGGCT GCCGCCGTTG  60
CACTGCTGGG AGGGGAAGAG ATTTAGATCT GGACCAGGCT GTGGGTAGAT GTGCAATAGA  120
AATAGCTAAT TTATTTCCCC AGGTGTGTGC TTTAGGCGTG GGCTGACCAG GCTTCTTCCT  180
ACATCTTCTT CCCAGTAAGT TTCCCCNCTG GCTTGACAGC ATGAGGTGTT GTGCATTTTG  240
TTCAGCTCCC CCAGGCTGTT CTCCAGGNTT CACAGTCTGG TGCTTGGGAG AGTNAAGGCA  300
GGGTTAAACT TCAGGAGCAG TTTGCCACCC NTNGTNCNGA TTATTTGGCT TGCTTTNCCN  360
NTACCATTTG CAAAANAGCC GTTT  384

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
AATTCGGCAC GAGGTTCCGC GAGGTTGAGG AACTGATGGA GGACACGCAG CACAAATTGC  60
GCACGCGGTG GAAGAGATGG AGGCAGAAGA AGCTGCTGCT AAAGCATCAT CAGAAGTGAA  120
CCTGGCAAAC TTACCTCCCA GCTATCACAA TGAGACCAAC ACAGACACGA AGGTTGGAAA  180
TAATACCATC CATGTGCACC GAGAAATTCA CAAGATAACC AACAACCAGA CTTGACAAAT  240
GGTCTTTTTC AGAGACAGTT TNACATCTTT GGGAGACGAG AAGCAGAGGN GCNCGNTNCN  300
TATNGCGNGG CTTTTGGCCA GATTNCTNCC ATTTNCAGTT CCNTAACTNC ACCTNCCGGC  360
CANGGTNTTT ACCCGGCATN GTTTTTGGCC ACTTTNTTTG GTATNNCCAA TGCCCNGGAG  420
ATNGCCTTTN NACANGGNTC ACCGGTTTTT TNTTCAAGGG TTTTCTTTTA AATCCTGGGG  480
--TTCTTCCC ACGTTTGNTT TCT  503

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 490 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GGCACGAGGG AGGCTTGAGG TGGAAGTGGG GGTCGGGCAC TCTGACCTGG TCGAGGAGGG  60
GCTAGGGTTT GAACCGGGGA CAGAGTCTAG GTGAGCTGGG GCTTGGGAGC TATTAGCGTA  120
GAGGATCCGG GTTCGGTTGC TCTGGCGAGG GCTCCAGCAT CACAGGCGGC GGCTGCGGGC  180
GCANAGCGTA GATGCAGCGG CTTGGGGCCA CCCTGCTGTG CCTGCTGCTG GCGGCGGCGG  240
TCCCCACGGC CCCCGCGCCC GCTCCGACGG CGACCTCGGC TCCAGTCAAG CCCGGCCCGG  300
CTCTGGACTN ACCCGCAGAG GGANGCCAAC CTNCAATGGA AATGTTTCCG CGNAGTTTGG  360
AGGAATNGAT GGGAAGGACA CGCNANNANA AATTGCGCNA GCGGTTGGGA AGAGATGGAA  420
-CAAGAAAG AAGTTGCTGG TGAAAGNATC ATCAAGAAAT GGAACTTGGC AAATTGAACT  480
--CCANNANT  490

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 84 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: cDNA 



SEQUENCE DESCRIPTION: SEQ ID NO-15- 

GCTCCSCTCS AGCCCGGOT — CTACCCATAG GTGGAGGNCA 
— xGNG.GC ANACCCTTGC CCAA 

INFORMATION FOR SEQ ID NO: 16: 

fiJ SEQUENCE CHARACTERISTICS : 

J LENGTH: 221 base pairs 
b TYPE: nucleic acid 
C STRANDEDNESS : double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



60 
84 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CAATGTGGAG TCTCCCTCTG ATTGGTTTTN GGGAAATGTG GAGAAGAGTG CCCTGCTTTC 
CAAACATCAA CCTGGCAAAA ATGCAACAAA TGNATTTTCC ACGCATTCTT TCCATGGGCA 
7AGGTAAGCT GTGCCTTCAG CTGTTGCAGA TGAAATGTTC TGTTCACCCT GCATTACATG 
T'TTTATTCA TCCAGCAGTG TTGCTCAGCT CCTACCTCTG T 
'-2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 557 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY : linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION : SEQ ID NO:17 : 
CCCACGAGTG CATCATCGAC GAGGACTGTG GGCCCAGCAT GTACTGCCAG TTTGCCAGCT 
TCCAGTACAC CTGCCAGCCA TGCCGGGGCC AGAGGATGCT CTGCACCCGG GACAGTGAGT 
C-CTGTGGAGA CCAGCTGTGT GTCTGGGGTC ACTGCACCAA AATGGCCACC AGGGGCAGCA 
ATGGGACCAT CTGTGACAAC CAGAGGGACT GCCAGCCGGG GCTGTGCTGT GCCTTCCAGA 
GAGGCCTGCT GTTCCCTGTG TGCACACCCC TGCCCGTGGA GGNGAGCTTT GCCATGACCC 
C-CAGCCGG CTTCTGGACC TCATCACCTG GGAGCTAGAG CCTGATGGAG CCTTGGACCG 

sracccrrcT gccagtggcc tcctctgcca gccccacagc cacagcctgg tgtatgtgtg 

CAAGCCGACC TTCGTGGGGA GCCGTGACCA AGATGGGGAG ATCTGCTGCC CAGAGAGGTC 
CCGATGAGTA TGAAGTTGGA ACTTCATGGA GGAGGTNCGC AAGAACTTGA AGACTTGAGA 
C^AGCTTACT GAANAAT 

INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS • 

(A) LENGTH: 410 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



60 
120 
180 
221 



60 
120 
180 
240 
300 
360 
420 

480 

540 

557 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GGGACCATCT GTGACAACCA GAGGGACTGC CAGCCGGGGC TGTGCTGTGC CTTCCAGAGA
GGCCTGCTGT TCCCTGTGTG CACACCCCTG CCCGTGGAGG GCGAGCTTTG CCATGACCCC
GCCAGCCGGC TTCTGGACCT CATCACCTGG GAGCTAGAGC CTGATGGAGC CTTGGACCGA
TGCCCTTGTG CCAGTGGCCT CCTCTGCCAG CCCCACAGCC ACAGCCTGGT GTATGTGTGC
AAGCCGACCT TCGTGGGGAG CCGTGACCAA GATGGGGAGA TCCTGCTGCC CAGAGAGGTC
CCCGATGAGT ATGAAGTTGG CAGCTTCATG GAGGAGGTGC GCCAGGAGCT GGAGGACCTG
GGAGAGGAGC CTTGACTTNA AGAGATGGCG CTGAGGGAGC CTTCGGGTTG
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS ■ 

(A) LENGTH: 397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 
KAAGTTGGCA GCTTCATGGA GGAGGTGCGC CAGGAGCTGG AGGACCTGGA GAGGAGCCTG 
ACTGAAGAGA TGGCGCTGGG GGAGCCTGCG CTGCCGCCTT GGCANTGCTG GGAGGGGAAG 
^ATTTAGAT CTGGACCAGG CTGTGGGTAG ATGTGCAATA GAAATAGCTA ATTTATTTCC 
C^AGGTNTGT GCTTTAGGCG TGGGCTGACC AGGCTTCTTC CNACATCTTC TTCCCAGTAA 
^..TCCCCTC TGGCTTGACA GCATGAGGTG TTNTGCATTT GTTCAGCTCC CCCAGGCTGT 
rCTCCAGGCT TCACAGTCTT GTGCTTGGGA GAGTCAGGCA GGGTTAAACT GCAGGAGCAG 
--TGCCACCC CTGTCCAGAT TATTTGGCTG CTTTGCC 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 356 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



60 
120 
180 
240 
300 
360 
410 
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(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 20: 
-CATCATCG ACGAGGACTG TGGGCCCAGC ATGTACTGCC AGTTTGCCAG CTTCCAGTAC 
CCTGCCAGC CATGCCGGGG CCAGAGGATG CTCTGCACCC GGGACAGTGA GTGCTGTGGA 
ACCAGCTGT GTGTCTGGGG TCACTGCACC AAAATGGCCA CCAGGGGCAG CAATGGGACC 



60 
120 
180 
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ATCTGTGACA ACCAGAGGGA CTNCCAGCCG GGGCTGTGCT GTGCCTTCCA GAGAGGCCTG 
CTGTTCCCTG TGTGCACANC CCTGCCCGTG GAGGGCGAGC TTTGCCATGA CCCCGNCAGC 
CGGNTTCTGG ACCTCATCAA CTGGGAGCTA GAGCCTGATG GAGCCTTGGA CCGATG 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS * 

(A) LENGTH: 319 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



240 
300 
356 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CTGGNGAGGA GCCTGACTGA AGAGATGGCG CTGGGGGAGC CTGCGGCTGC CGCCGCTGCA  60
CTGCTGGGAG GGGAAGAGAT TTAGATCTGG ACCAGGCTGT NGGTAGATGT GCAATAGAAA  120
TAGCTAATTT ATTTCCCCAG GTGTGTGCTT TAGGCGTGGG CTGACCAGTC TTCTTCCTAC  180
ATCTTCTTCC CANTAAGTTT CCCCTCTGGC TTGACAGCAT GAGGTGTTGT GCATTTTTTC  240
AGCTCCCCCA GGCTGTTCTC CAGGCTTCAC AGTCTGGTGC TTGGGAGAGT CAGGCAGGGT  300
TAAACTNCAG GAGCAGTTT  319

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 298 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CTTCATGGAG GAGGTGCGCC AGGAGCTGGA GGACCTGGAG AGGAGCCTGA CTGAAGAGAT 
GGCGCTGGGG GAGCCTGCGG CTGCCGCCGT GNCACTGCTG GGAGGGGAAG AGATTTAGAT 
CTGGACCAGG CTGTGGGTAG ATGTGCAATA GAAATAGCTA ATTTATTTCC CCAGGTGTGT 
GCTTTAGGCG TGGGCTGACC AGGCTTCTTN CTACATCTTC TTCCCAGTAA GTTTCCCCTC 
TGGCTTGACA GCATGAGGTG TTGTGCATTT GTTCAGCTCC CCCAGGCTGT TCTCCAGG 
(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 302 base pairs 

(B) TYPE: nucleic acid 



60 
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180 
240 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO.-23: 

GTCAAGCCCG GCCCGGCTCT CAGCTACCCG CAGAGGAGGC CACCCTCAAT GAGATGTTCC 

GCGAGGTTGA GGAACTGATG GAGGACACGC AGCACAANTT GCGCANGCGG TTGGAAGAGA 

TGGAGGCAGA AGAAGCTGCT GCTAAAGCAT CATCAGAAGT GAACCTGGCA AACTTACCTC 

CCAGCTATCA CAATGAGACC AACACAGACA CGAAGGTTGG AAATAATACC ATCCATGTGC 

ACCGAGAAAT TCACAAGATA ACCAACAACC AGACTGGACA AATGGTCTTT TCAGAGACAG 
TT 

(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS ■ 

(A) LENGTH: 279 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(Xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
GAGGAGCCTG ACTGAAGAGA TGGCGCTGAG GGAGCCTGCG GCTGCCGCCG TNGCACTGCT 
GGGAGGGGAA GAGATTTAGA TCTGGACCAG GCTGTGGGTA GATGTGCAAT AGAAATAGCT 
AATTTATTTC CCCAGGTGTG TGCTTTAGGC GTGGGCTGAN CAGGCTTCTT NCTACATCTT 
CTTGCCAGTA NGNTTCCCCT CTGGCTTGAC AGCATGAGGT GTTGTGCATT TGTTCAGCTC 
CCCCAGGCTG TTCTCCAGGC TTCACAGTCT GGTGCTTGG 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH; 263 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



60 
120 
180 
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60 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
TACCATCCAT GTCCACCSAG AAATTCACAA GATAACCAAC AACGAGA CT G GACAAATGGT SO 
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CTTTTCAGAG ACAGTTATCA GA TOOT GGG AGACGAAGAA eaaam 
CATCATCGAC GAGGACTNTG GGCCCAGCAT GTACTGCCAG TTTGCCAGCT TCCAGTACAC 
CTGCCAGCCA TGCCGGGGCC AGAGGATGCT CTNCACCCGG GACAGTGAGT GCTGTGGAGA 
CCAGCTGTGT GTCTGGGGTC ACT 
(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GGCTGCGGCG CAAGCGANGA TGCAGCGGCT TGGGGCCACC CTGCTGTGCC TGCTGCTGGC 
GGCGGCGGTC CCCACGGCCC CCGCGCCCGC TCCGACGGCG ACCTCCCCTC CAGTCAAGCC 
CGGCCCGGCT CTCAGCTACC GCGCAGGAGG AGGCCACCCT CAATGAGATG TTCCGCGAGG 
TTGAGGAACT GATGGAGGAC ACGCAGCACA AATTGGCACC GGTGGAAGAG ATGGAGGCAG 
AAGAAGCTGC TGCTAAAGCA TCATCAGAAG TGAACCTGGC AAACTTACCT CCCAGCTATC 
ACAATGAGAC CAACACAGAC ACGAAGGTTG GAAATAATAC CATCCATGTG CACCGAGAA 
(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS * 

(A) LENGTH; 325 base pairs 

(B) TYPE; nucleic acid 

(C) STRANDEDNESS; double 

(D) TOPOLOGY; linear 

(ii) MOLECULE TYPE; cDNA 

(xi) SEQUENCE DESCRIPTION; SEQ ID NO;27; 
ACCAGAGGGA CTGCCAGCCG GGGTGTGCTC GTGCCTTCCA GAGAGGCCTG CTGTTCCCTG 
TCTGCACACC CCTGCCCGTG GAGCGGACGC TTTGCATGAC CCCGCCAGCC GGCTTCTGGA 
CCTCATCACC TGGGAGCTAG AGCCTGATGG AGCCTTGGAC CGATGCCCTT GTGCCAGTGG 
CTCCTCTGCC AGCCCCACAG CCACAGCCTG GTGTATGTGT GCAAGCCGAC CTTCGTGGGG 
AGCCGTGACC AAGATGGGGA GATCCTGCTG CCCAANAAAG GTCCCCGATT GAGTATGAAG 
TTGGCAAGCT TCATGGAAGG AANGG 
(2) INFORMATION FOR SEQ ID NO; 28; 
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U) SEQUENCE CHARACTERISTICS ■ 

(A) LENGTH: 238 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
GAGAAATTCA CAAGATAACC AACAACCAGA CTGGACAAAT GGTCTTTTCA GAGACAGTTA 
TCACATCTGT GGGAGACGAA GAAGGCAGAA GGAGCCACGA GTGCATCATC GACGAGGACT 
NTGGGCCCAG CATGTACTGC CAGTTTNCCA GCTTCCAGTA CACCTGCCAG CCATGCCGGG 
GCCAGAGGAT GCTCTGCACC CGGGACAGTG AGTGCTGTGG AGACCAGCTG TGTGTCTG 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 236 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
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(xi) SEQUENCE DESCRIPTION : SEQ ID NO:29: 
TTGGAAATAA TACCATCCAT GTGCACCGAG AAATTCACAA GATAACCAAC AACCAGACTG 
CACAAATGGT CTTTTCAGAG ACAGTTATCA CATCTGTGGG AGACGAAGAA GGCAGAAGGN 
GCCACGAGTG CATCATCGAC GAGGACTGTG GGCCCAGCAT GTACTGCCAG TTTGCCAGCT 
TCCAGTACAC CTGCCAGCCA TGTNGGGGCC AGAGGATGCT CTGCACCCGG GACAGT 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 344 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
<D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 
CCAGCTGTGT GTCTGGGGTC ACTGCACCAA AATGGCCACC AGGGGCAGCA ATGGGACCAT 
CTGTGACAAC CAGAGGGACT GCCAGCCGGG GCTGTGCTGT GCCTTCCAGA GAGGCCTGCT 
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GTTCCCTGTG TGCACACCCC TGCCCGTGGA GGGANGCTTT GCCATGACCC CGCCAGCCGG 
CTTCTGGACC TCATCACCTG GGGAGCTAGA GCCTGATGGA GCCTTGGGAC CGATGCCCTT 
GTGCCAGTGG CCTCCTCTTG CCAGCCCCAC AGCCACAGCC TGGGTGTATG TTGTGCAAAG 
CCGACCTTCG TNGGGGAACC GTGACCAAGA TGGGGGAGAT TCTT 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS • 

(Aj LENGTH: 218 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS; double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
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Ui) SEQUENCE DESCRIPTION : SEQ ID NO:31: 
TTTTTTGGGG AAATAAATTA GCTATTTCTA TTGCACATCT ACCCACAGCC TGGTCCAGAT 
CTAAATCTCTTCCCCTCCCA GCAGTGCAGC GGCGGCANAG GNCTCCCCCA GCGCCATCTC 
TTCAGTCAGG CTCCTCTCCA GGTCCTCCAG CTCCTGGCGC ACCTCCTCCA TGAAGCTGCC 
AACTTCATAC TCATCGGGGA CCTCTCTGGG CAGCAGGA 
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 247 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 
CCAGACGGAG ATGCAGCGGC GTTGGGGCCA CCCTGACTGT GCCTGCTGCT GGCGGCGGCG 
GTCCCCACGG CCCCCGCGCC CGCTCCGACG GCGACCTCGG CTCCAGTCAA GCCCGGCCCG 
GCTCTCAGCT ACCCGCAGGA GCGAGGCCAC CCTCAATGAG ATGTTCCGCG AGGTTGAGGA 
ACTGATGGAG GACACGCAGC ACAAATTGCG CAGCGGTGGG AAGAGATGGA GGCAGAAGAA 
GCTGCTG 

'2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 210 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
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(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 33: 
CTGGNAGAAG GAGCCTGACT GAAAGAGATG GCGCTGGGGG AGCCTGCGGC TGCCGCCGTG 
KCACTGCTGG GAGGGGAAGA GATTTAGATC TGGACCAGGC TGTGGGTAGA TGTGCAATAG 
AAATAGCTAA TTTATTTCCC CAGGTGTGTG CTTTAGGCGT GGGCTGACCA GGNTTCTTCC 
TACATCTTCT TCCCAGTAAG TTTCCCCTCT 
12) INFORMATION FOR SEQ ID NO: 34: 



(i) SEQUENCE CHARACTERISTICS- 

(A) LENGTH: 303 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

CCAGTACTGG TACAATATGG ATCTTTTCAG AGACAGGTTA TCACATCTGT NGGAGACGAA 60 

GAAGGCAGAA GGAGCCACGA GTGCATCATC GACGAGGACT GTGGGCCCGG CTCTCAGCTA 120 

-CCGCAGAGG AGGCCACCCT CCTHTAGATG TTCCGCGAGT TGAGGACTGA TGGAGGACAC 180 

GCTGCACTGC TGGGAGGGGA AGAGATTTAG ATCTGGACCA GGCTGTGGGT AGATGTGCAA 240 

TAGAAATAGC TAATTTATTT CCCAGGTGTG TGCTTTAGGC GTGGCTGACC AGGTTCTTCT 300 

ACA 303 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

CGCTGCACTG CTGGGAGGGG AAGAGATTTA GATCTGGACC AGGCTGTGGG TAGATGTGCA 60 

ATAGAAATAG CTAATTTATT TCCCCAGGTG TGTGCTTTAG GCGTGGGCTG CCCAGGCTTC 120 

TTCCTACATN TCCGTCCCNG TAAGTTTCCC CTCTAGCGAA AACAGAATAA GGTG 174 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 151 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GACGAAGAAG GCAGAAGGAG CCACGAGTGC ATCATCGACG AGGACTGTGG GCCAAGCATG 60 
TACTGCCAGT TTAACAGCTA ACAGTACCAC CTGCCAGCCA TGCCGGAAAA AGAGGATGAC 120 
TCTGCACCCG GGACAGTGAG TGACTGTAGG A 
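
Each entry in the listing above declares a LENGTH and then gives the sequence in groups of ten bases with a running position counter at the end of full rows. As a minimal sketch of how a declared length can be cross-checked against the printed rows, the following Python fragment counts the bases of SEQ ID NO: 36. The helper name `count_bases` is hypothetical and is not part of the application; it assumes that every letter in a row is a base code and that digits and spaces are layout only.

```python
import re

def count_bases(listing_rows):
    """Count base letters in sequence-listing rows.

    Assumption: letters are base codes (A, C, G, T, N, ...);
    spaces and trailing position counters such as "60" are layout.
    """
    total = 0
    for row in listing_rows:
        letters = re.sub(r"[^A-Za-z]", "", row)  # drop digits and whitespace
        total += len(letters)
    return total

# Rows of SEQ ID NO: 36 as printed in the listing.
seq_id_36 = [
    "GACGAAGAAG GCAGAAGGAG CCACGAGTGC ATCATCGACG AGGACTGTGG GCCAAGCATG 60",
    "TACTGCCAGT TTAACAGCTA ACAGTACCAC CTGCCAGCCA TGCCGGAAAA AGAGGATGAC 120",
    "TCTGCACCCG GGACAGTGAG TGACTGTAGG A",
]

declared_length = 151  # from "(A) LENGTH: 151 base pairs"
observed_length = count_bases(seq_id_36)
print(observed_length, observed_length == declared_length)
```

A mismatch between the observed and declared lengths would suggest that a row was dropped or a character was lost when the listing was transcribed.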

What Is Claimed Is: 



1. An isolated nucleic acid molecule comprising a polynucleotide 
having a nucleotide sequence at least 95% identical to a sequence selected from 
the group consisting of: 

(a) a nucleotide sequence encoding the amino acid sequence at 
position about -21 to about 329 in SEQ ID NO:2; 

(b) a nucleotide sequence encoding the amino acid sequence at 
position about -20 to about 329 in SEQ ID NO:2; 

(c) a nucleotide sequence encoding the amino acid sequence at 
position about 1 to about 329 in SEQ ID NO:2; 

(d) a nucleotide sequence encoding the CESP polypeptide having the 
complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; 

(e) a nucleotide sequence encoding the mature CESP polypeptide 
having the amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; and 

(f) a nucleotide sequence complementary to any of the nucleotide 
sequences in (a), (b), (c), (d), or (e). 

2. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the complete nucleotide sequence in SEQ ID NO:1. 

3. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the nucleotide sequence in SEQ ID NO:1 encoding the CESP polypeptide 
having the complete amino acid sequence in SEQ ID NO:2. 

4. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the nucleotide sequence in SEQ ID NO:1 encoding the mature CESP 
polypeptide having the amino acid sequence in SEQ ID NO:2. 

5. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the complete nucleotide sequence of the cDNA clone contained in ATCC 
Deposit No. 97728. 

6. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the nucleotide sequence encoding the CESP polypeptide having the complete 
amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 
97728. 

7. The nucleic acid molecule of claim 1 wherein said polynucleotide 
has the nucleotide sequence encoding the mature CESP polypeptide having the 
amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 
97728. 

8. An isolated nucleic acid molecule comprising a polynucleotide 
which hybridizes under stringent hybridization conditions to a polynucleotide 
having a nucleotide sequence identical to a nucleotide sequence in (a), (b), (c), (d), 
or (e) of claim 1, wherein said polynucleotide which hybridizes does not hybridize 
under stringent hybridization conditions to a polynucleotide having a nucleotide 
sequence consisting of only A residues or of only T residues. 

9. An isolated nucleic acid molecule comprising a polynucleotide 
which encodes the amino acid sequence of an epitope-bearing portion of a CESP 
polypeptide having an amino acid sequence in (a), (b), (c), (d), or (e) of claim 1. 

10. The isolated nucleic acid molecule of claim 9, which encodes an 
epitope-bearing portion of a CESP polypeptide selected from the group consisting 
of: a polypeptide comprising amino acid residues from about amino acid -1 to 
about 65 in SEQ ID NO:2; a polypeptide comprising amino acid residues from 
about 71 to about 105 in SEQ ID NO:2; a polypeptide comprising amino acid 
residues from about 114 to about 136 in SEQ ID NO:2; a polypeptide comprising 
amino acid residues from about 148 to about 169 in SEQ ID NO:2; a polypeptide 
comprising amino acid residues from about 174 to about 198 in SEQ ID NO:2; 
a polypeptide comprising amino acid residues from about 213 to about 229 in 
SEQ ID NO:2; a polypeptide comprising amino acid residues from about 234 to 
about 253 in SEQ ID NO:2; and a polypeptide comprising amino acid residues 
from about 267 to about 315 in SEQ ID NO:2. 

11. An isolated nucleic acid molecule, comprising a polynucleotide 
having a sequence selected from the group consisting of: 

(a) the nucleotide sequence of a fragment, wherein said 
fragment comprises at least 50 contiguous nucleotides of the coding region of the 
sequence shown in SEQ ID NO:1, provided that said isolated nucleic acid 
molecule is not SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID 
NO:15, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ 
ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ ID NO:25, 
SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ ID NO:29, SEQ ID 
NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ 
ID NO:35, or SEQ ID NO:36, or any subfragment thereof; and 

(b) a nucleotide sequence complementary to a nucleotide 
sequence in (a). 

12. A method for making a recombinant vector comprising inserting 
an isolated nucleic acid molecule of claim 1 into a vector. 

13. A recombinant vector produced by the method of claim 12. 

14. A method of making a recombinant host cell comprising 
introducing the recombinant vector of claim 13 into a host cell. 

15. A recombinant host cell produced by the method of claim 14. 

16. A recombinant method for producing a CESP polypeptide, 
comprising culturing the recombinant host cell of claim 15 under conditions such 
that said polypeptide is expressed and recovering said polypeptide. 

17. An isolated CESP polypeptide having an amino acid sequence at 
least 95% identical to a sequence selected from the group consisting of: 

(a) amino acids about -21 to about 329 in SEQ ID NO:2; 

(b) amino acids about -20 to about 329 in SEQ ID NO:2; 

(c) amino acids about 1 to about 329 in SEQ ID NO:2; 

(d) the amino acid sequence of the CESP polypeptide having the 
complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; and 

(e) the amino acid sequence of the mature CESP polypeptide having 
the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit 
No. 97728; and 

(f) the amino acid sequence of an epitope-bearing portion of any one 
of the polypeptides of (a), (b), (c), (d), or (e). 

18. An isolated polypeptide comprising an epitope-bearing portion of 
the CESP protein, wherein said portion is selected from the group consisting of: 
a polypeptide comprising amino acid residues from about amino acid -1 to about 
65 in SEQ ID NO:2; a polypeptide comprising amino acid residues from about 71 
to about 105 in SEQ ID NO:2; a polypeptide comprising amino acid residues from 
about 114 to about 136 in SEQ ID NO:2; a polypeptide comprising amino acid 
residues from about 148 to about 169 in SEQ ID NO:2; a polypeptide comprising 
amino acid residues from about 174 to about 198 in SEQ ID NO:2; a polypeptide 
comprising amino acid residues from about 213 to about 229 in SEQ ID NO:2; 
a polypeptide comprising amino acid residues from about 234 to about 253 in 
SEQ ID NO:2; and a polypeptide comprising amino acid residues from about 267 
to about 315 in SEQ ID NO:2. 

19. An isolated nucleic acid molecule comprising a polynucleotide 
encoding a CESP polypeptide wherein, except for one to fifty conservative amino 
acid substitutions, said polypeptide has a sequence selected from the group 
consisting of: 

(a) a nucleotide sequence encoding the amino acid sequence at 
position about -21 to about 329 in SEQ ID NO:2; 

(b) a nucleotide sequence encoding the amino acid sequence at 
position about -20 to about 329 in SEQ ID NO:2; 

(c) a nucleotide sequence encoding the amino acid sequence at 
position about 1 to about 329 in SEQ ID NO:2; 

(d) a nucleotide sequence encoding the CESP polypeptide having the 
complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; 

(e) a nucleotide sequence encoding the mature CESP polypeptide 
having the amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; and 

(f) a nucleotide sequence complementary to any of the nucleotide 
sequences in (a), (b), (c), (d), or (e). 

20. An isolated CESP polypeptide wherein, except for one to fifty 
conservative amino acid substitutions, said polypeptide has a sequence selected 
from the group consisting of: 

(a) amino acids about -21 to about 329 in SEQ ID NO:2; 

(b) amino acids about -20 to about 329 in SEQ ID NO:2; 

(c) amino acids about 1 to about 329 in SEQ ID NO:2; 

(d) the amino acid sequence of the CESP polypeptide having the 
complete amino acid sequence encoded by the cDNA clone contained in ATCC 
Deposit No. 97728; and 

(e) the amino acid sequence of the mature CESP polypeptide having 
the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit 
No. 97728; and 

(f) the amino acid sequence of an epitope-bearing portion of any one 
of the polypeptides of (a), (b), (c), (d), or (e). 

10 30 50 

-72 AGAGGATCCGGGTTCGGTTGCTCTGGCGAGGGCTCCAGCATCACAGGCGGCGGCTGCGGG - 13 

70 90 110 

- 12 CGCAGAGCGGAGATGCAGCGGCTTGGGGCCACCCTGCTGTGCCTGCTGCTGGCGGCGGCG 47 
-3 MQRLGATLLCLLLAAA 16 

130 150 170 

48 GTCCCCACGGCCCCCGCGCCCGCTCCGACGGCGACCTCGGCTCCAGTCAAGCCCGGCCCG 107 
17 VPTAPAPAPTATSAPVKPGP 36 

190 210 230 

108 GCTCTCAGCTACCCGCAGGAGGAGGCCACCCTCAATGAGATGTTCCGCGAGGTTGAGGAA 167 
37 ALSYPQEEATLNEMFREVEE 56 

250 270 290 

168 CTGATGGAGGACACGCAGCACAAATTGCGCAGCGCGGTGGAAGAGATGGAGGCAGAAGAA 227 
57 LMEDTQHKLRSAVEEMEAEE 76 

310 330 350 

228 GCTGCTGCTAAAGCATCATCAGAAGTGAACCTGGCAAACTTACCTCCCAGCTATCACAAT 287 
77 AAAKASSEVNLANLPPSYHN 96 

370 390 410 

288 GAGACCAACACAGACACGAAGGTTGGAAATAATACCATCCATGTGCACCGAGAAATTCAC 347 
97 ETNTDTKVGNNTIHVHREIH 116 

430 450 470 

348 AAGATAACCAACAACCAGACTGGACAAATGGTCTTTTCAGAGACAGTTATCACATCTGTG 407 
117 KITNNQTGQMVFSETVITSV 136 

490 510 530 

408 GGAGACGAAGAAGGCAGAAGGAGCCACGAGTGCATCATCGACGAGGACTGTGGGCCCAGC 467 
137 GDEEGRRSHECIIDEDCGPS 156 

550 570 590 

468 ATGTACTGCCAGTTTGCCAGCTTCCAGTACACCTGCCAGCCATGCCGGGGCCAGAGGATG 527 
157 MYCQFASFQYTCQPCRGQRM 176 

610 630 650 

528 CTCTGCACCCGGGACAGTGAGTGCTGTGGAGACCAGCTGTGTGTCTGGGGTCACTGCACC 587 
177 LCTRDSECCGDQLCVWGHCT 196 

670 690 710 

588 AAAATGGCCACCAGGGGCAGCAATGGGACCATCTGTGACAACCAGAGGGACTGCCAGCCG 647 
197 KMATRGSNGTICDNQRDCQP 216 

730 750 770 

648 GGGCTGTGCTGTGCCTTCCAGAGAGGCCTGCTGTTCCCTGTGTGCACACCCCTGCCCGTG 707 
217 GLCCAFQRGLLFPVCTPLPV 236 

790 810 830 

708 GAGGGCGAGCTTTGCCATGACCCCGCCAGCCGGCTTCTGGACCTCATCACCTGGGAGCTA 767 
237 EGELCHDPASRLLDLITWEL 256 

850 870 890 

768 GAGCCTGATGGAGCCTTGGACCGATGCCCTTGTGCCAGTGGCCTCCTCTGCCAGCCCCAC 827 
257 EPDGALDRCPCASGLLCQPH 276 

910 930 950 

828 AGCCACAGCCTGGTGTATGTGTGCAAGCCGACCTTCGTGGGGAGCCGTGACCAAGATGGG 887 
277 SHSLVYVCKPTFVGSRDQDG 296 

970 990 1010 

888 GAGATCCTGCTGCCCAGAGAGGTCCCCGATGAGTATGAAGTTGGCAGCTTCATGGAGGAG 947 
297 EILLPREVPDEYEVGSFMEE 316 

1030 1050 1070 

948 GTGCGCCAGGAGCTGGAGGACCTGGAGAGGAGCCTGACTGAAGAGATGGCGCTGGGGGAG 1007 
317 VRQELEDLERSLTEEMALGE 336 

1090 1110 1130 

1008 CCTGCGGCTGCCGCCGCTGCACTGCTGGGAGGGGAAGAGATTTAGATCTGGACCAGGCTG 1067 
337 PAAAAAALLGGEEI* 350 

1150 1170 1190 

1068 TGGGTAGATGTGCAATAGAAATAGCTAATTTATTTCCCCAGGTGTGTGCTTTAGGCGTGG 1127 

1210 1230 1250 

1128 GCTGACCAGGCTTCTTCCTACATCTTCTTCCCAGTAAGTTTCCCCTCTGGCTTGACAGCA 1187 

1270 1290 1310 

1188 TGAGGTGTTGTGCATTTGTTCAGCTCCCCCAGGCTGTTCTCCAGGCTTCACAGTCTGGTG 1247 

1330 1350 1370 

1248 CTTGGGAGAGTCAGGCAGGGTTAAACTGCAGGAGCAGTTTGCCACCCCTGTCCAGATTAT 1307 

1390 1410 1430 

1308 TGGCTGCTTTGCCTCTACCAGnGGCAGACAGCCGTTTGTTCTACATGGCTTTGATAATT 1367 

1450 1470 1490 

1368 GTTTGAGGGGAGGAGATGGAMCMTGTGGAGTCTCCCTCTGATTGGTTTTGGGGAAATG 1427 

1510 1530 1550 

1428 TGGAGMGAGTGCCCTGCTTTGCAMCATCMCCTGGCAAAAATGCAACAAATGAATTTT 1487 

1570 1590 1610 

1488 CCACGCAGTTCTTTCCATGGGCATAGGTAAGCTGTGCCTTCAGCTGTTGCAGATGAAATG 1547 

1630 1650 1670 

1548 TTCTGTTCACCCTGCATTACATGTGTTTATTCATCCAGCAGTGTTGCTCAGCTCCTACCT 1607 

1690 1710 1730 

1608 CTGTGCCAGGG(^GCATTTTCATATCCAAGATCMnCCCTCTCTCAGCACAGCCTGGGG 1667 

1750 1770 1790 

1668 AGGGGGTCATTGTTCTCCTCGTCCATCAGGGATCTCAGAGGCTCAGAGACTGCAAGCTGC 1727 

1810 1830 1850 

1728 TTGCCCAAGTCACACAGCTAGTGAAGACCAGAGCAGTTTCATCTGGTTGTGACTCTAAGC 1787 

1870 1890 1910 

FIG.1B 

1788 TCAGTGCTCTCTCCACTACCCCACACCAGCCTTGGTGCCACCAAAAGTGCTCCCCAAAAG 1847 

1930 1950 1970 

1848 GMGGAGMTGGGATTTTTCTTTTGAGGCATGCACATCTGGAATTAAGGTCAAACTAATT 1907 

1990 2010 2030 

1908 CTCACATCCCTCTAAAAGTAAACTACTGTTAGGAACAGCAGTGTTCTCACAGTGTGGGGC 1967 

2050 2070 2090 

1968 AGCCGTCCTTCTAATGAAGACAATGATATTGACACTGTCCCTCTTTGGCAGTTGCATTAG 2027 

2110 2130 2150 

2028 TAACTTTGAAAGGTATATGACTGAGCGTAGCATACAGGTTAACCTGCAGAAACAGTACTT 2087 

2170 2190 2210 

2088 AGGTAATTGTAGGGCGAGGATTATAAATGAAATTTGCAAAATCACTTAGCAGCAACTGAA 2147 

2230 2250 2270 

2148 GACAATTATCAACCACGTGGAGAAAATCAAACCGAGCAGTGCTGTGTGAAACATGGTTGT 2207 

2290 2310 2330 

2208 AATATGCGACTGCGAACACTGAACTCTACGCCACTCCACAAATGATGTTTTCAGGTGTCA 2267 

2350 2370 2390 

2268 TGGACTGTTGCCACCATGTATTCATCCAGAGTTCTTAAAGTTTAAAGTTGCACATGATTG 2327 

2410 2430 2450 

2328 TATMGCATGCnTCTnGAGTTnAAAnATGTATAAACATAAGnGCATTTAGAAATC 2387 



2470 2490 

2388 AAGCATAAATCACTTCAACTGCTAAAAAAA 2417 



FIG.1C 

5 GATLLCLLLAAAVPTAPAPAPTATSAPVKPGPALSYPQEEATLNEMFREV 54 